PREFACE


The  purpose  of  this  study  was  to  develop  a  software 
environment  for  conducting  statistical  analysis  of  exponential 
failure  models  interactively.  The  environment  was  successfully 
completed  and  thoroughly  tested. 

The  method  used  in  this  study  to  develop  the  interactive 
environment  can  be  used  to  extend  the  capabilities  of  this 
system  by  adding  the  ability  to  process  other  distributions  or 
unrelated  tasks. 

In developing this thesis effort I am particularly indebted
to my thesis advisor, Dr. Panna Nagarsenker, for her
continuing patience and assistance in times of need. I also
wish to thank my wife, Nancy, for her understanding and support.


Mark  F.  Amell 
ABSTRACT

This study developed a software environment for conducting
statistical analysis of exponential failure models
interactively. Exponential failure models work with
small amounts of data and are particularly valuable when
gathering large amounts of data is not practical, as when each
observation represents a loss of lives or airplanes.

The  design  of  the  system  down  to  the  individual  modules 
demonstrated  that  by  using  the  software  engineering  techniques 
explored,  an  environment  can  be  efficiently  created.  The 
approach used in this project not only created a valuable
tool, but its use is encouraged in further developments for
other distributions.
I. Introduction


The  primary  task  of  this  thesis  effort  is  to  develop  a 
software  environment  for  conducting  statistical  analysis  of 
exponential failure models interactively.


Background 

Failure rates have many applications and uses. For example:

1. The weather radar system on a 747 aircraft has a
mean time between failures (MTBF) of 1,140 hours. Assuming an
exponential time-to-failure distribution, the following
questions can be answered: (1) What is the probability of
failure during a four-hour flight? (2) What is the maximum
length of a flight such that the reliability of the system
will not be less than 0.99, assuming the system is in
continuous operation during the flight? (3) How often should
the weather radar system on a 747 aircraft be completely
serviced?


2. A military vehicle has a required 200 kilometer
mission reliability of 97.5%. (1) What is the vehicle's
required MTBF? (2) How many missions can be run before
exceeding a 10% chance for mission failure?


3. Several different military aircraft have equipment
failures. The data gathered give the intervals in operating
hours between the successive failures of the equipment for
each aircraft considered. This is an example where we have
several series rather than one. A point of obvious interest
is whether the observations differ significantly
between different aircraft. This is the point of primary
interest in considering, for example, matters of preventive
maintenance.

4.  Failure  times  of  several  electronic  computers  are 
recorded  and  if  the  Poisson  process  is  observed,  then  the 
interesting  point  to  discover  is  whether  all  the  computers 
have  the  same  average  failure  rate. 

5. A new design is to be tested, and if it is judged to
be superior to the old design then the old design will be
replaced. From extensive testing over the years the old
design is known to have an MTBF of 1,250 hours. If the new
design has an MTBF that is 50% better than the old design, we
desire an 80% chance of selecting the new design. If the new
design is no better than the old design, we desire a 95%
chance of not selecting the new design. (1) How many
failures will we have to observe? (2) Assuming that the new
design has an MTBF of 1,565 hours and all items placed on
test are allowed to fail, what is the expected time of
testing? (3) How many items must be tested to obtain a [?]0%
reduction in test time?
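
The first two examples can be worked directly from the exponential
reliability function R(t) = exp(-t / MTBF). The following Python
sketch is an illustration only and is not part of the thesis
software; the function names are hypothetical, and question 2(2) is
read here as asking for the largest number of independent missions
whose combined chance of at least one failure stays at or below 10%.

```python
import math


def failure_probability(duration, mtbf):
    """Probability of at least one failure within `duration` (same units as mtbf)."""
    return 1.0 - math.exp(-duration / mtbf)


def max_duration(reliability, mtbf):
    """Longest operating time for which exp(-t / mtbf) stays >= reliability."""
    return -mtbf * math.log(reliability)


def required_mtbf(mission_length, reliability):
    """MTBF needed so that a mission of the given length has the given reliability."""
    return -mission_length / math.log(reliability)


def max_missions(mission_reliability, max_failure_chance):
    """Largest k with mission_reliability**k >= 1 - max_failure_chance.

    Assumes missions are independent (an assumption of this sketch).
    """
    return math.floor(math.log(1.0 - max_failure_chance) / math.log(mission_reliability))


if __name__ == "__main__":
    print(failure_probability(4, 1140))   # example 1, question (1)
    print(max_duration(0.99, 1140))       # example 1, question (2), in hours
    print(required_mtbf(200, 0.975))      # example 2, question (1), in kilometers
    print(max_missions(0.975, 0.10))      # example 2, question (2)
```

Under these assumptions the four-hour failure probability is about
0.0035, the longest flight at 0.99 reliability is about 11.5 hours,
and the vehicle needs an MTBF of roughly 7,900 kilometers.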


While facing such problems a manager wishes to have an
automated interactive system that helps to answer such
questions. The objective of this project is to provide such
a facility, hence to create a software environment for
conducting the statistical analysis described below.

The  problems  in  all  the  above  cases  reduce  to  testing 
the  hypothesis  that  p  samples  have  been  randomly  drawn  from 
the  same  exponential  distribution.  The  hypothesis  can  be 
tested by using the likelihood ratio criterion, and for p = 2
this reduces to an F test. Bartlett's M test for homogeneity
of  variances  in  samples  from  a  normal  population  can  also  be 
used  to  examine  this  hypothesis  for  a  general  p. 

The exact distribution and percentage points of the
likelihood ratio criterion were obtained by Dr. Nagarsenker
in 1980; a review follows.

Suppose that p samples of equal size n are available and
that the ith sample contains observations $x_{ij}$
(i = 1,...,p; j = 1,...,n) with mean $\bar{x}_i$ and has been
drawn from an exponential distribution with mean $\theta_i$.
The likelihood ratio criterion for testing the hypothesis
$H_0: \theta_1 = \cdots = \theta_p$ against a general
alternative is

    $$ L = \prod_{i=1}^{p} \left( \bar{x}_i / \bar{x} \right) \qquad (1) $$

where $\bar{x}$ is the mean of the combined sample. It can be
easily shown that, when the hypothesis is true,

    $$ E(L^h) = p^{ph} \, \frac{\Gamma(np)}{\Gamma(np + ph)} \left\{ \frac{\Gamma(n + h)}{\Gamma(n)} \right\}^{p} \qquad (2) $$


If we use the inverse transform in (2), the density of L is

    [equation (3) not recoverable]

where [definition not recoverable]. Putting [substitution not
recoverable] in the integrand on the right-hand side of (3), we
have that

    [equation not recoverable]

Using the well-known expansion for $\log \Gamma(x + h)$, we obtain

    [equation not recoverable]

where [definitions not recoverable] and the coefficients can be
successively computed using the following recurrence relations:

    [recurrence relations not recoverable]

where $A_r$ is [definition not recoverable]. Since [condition not
recoverable], we can expand it as a factorial series in the form

    [equation not recoverable]

where the coefficients $R_r$ can be determined using the
following relations:

    [relations not recoverable]

Using term by term integration, we have the exact
distribution of L in the form

    [equation not recoverable]

where [definition not recoverable] and I is the
incomplete beta function.


Problem  Overview 


The system created for this software environment
consists of three subsystems, described as follows. Subsystem
one calculates the likelihood ratio l-value from the
data points given by the user. Subsystem two calculates the
theoretical value. Subsystem three compares the results
from subsystems one and two and reports its result.

The  user  will  interact  with  the  system  by  answering 
questions  posed  by  the  computer.  At  the  start  of  the  system 
the  number  of  samples  and  the  sample  size  will  be  asked  for. 
The  user  can  then  test  for  multiple  alpha  levels  of  those 
samples. All the user's answers will first be tested to see
if they are meaningful; then the data will be processed.
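
The kind of check meant by "meaningful" can be sketched as follows.
This Python fragment is a hypothetical illustration, not the thesis
code: the function name and the specific limits (at least two
samples, at least one observation per sample, alpha strictly between
zero and one) are assumptions chosen for the example.

```python
def validate_inputs(num_samples, sample_size, alpha):
    """Return a list of problems found in the user's answers (empty if none)."""
    problems = []
    # A comparison between populations needs at least two samples (assumed limit).
    if not isinstance(num_samples, int) or num_samples < 2:
        problems.append("number of samples must be an integer of at least 2")
    # Every sample must contain at least one observation (assumed limit).
    if not isinstance(sample_size, int) or sample_size < 1:
        problems.append("sample size must be an integer of at least 1")
    # A significance level is a probability strictly between 0 and 1.
    if not 0.0 < alpha < 1.0:
        problems.append("alpha level must lie strictly between 0 and 1")
    return problems


if __name__ == "__main__":
    print(validate_inputs(3, 10, 0.05))
    print(validate_inputs(1, 0, 1.5))
```

Only when the returned list is empty would the data be passed on for
processing; otherwise the questions would be asked again.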

The data that is to be used for statistical analysis may be
read  from  a  data  file  or  obtained  interactively  from  the 
user.  One  of  the  interactive  questions  is  the  name  of  the 
file  where  the  data  is  stored.  Other  specifics  of  how  to  run 
the  system  are  contained  in  the  user  manual  in  Appendix  C. 

The  problem  was  attacked  by  breaking  down  the  entire 
system into several subsystems. Each subsystem was compiled
separately  using  the  facilities  of  the  UNIX  operating  system. 
The  following  paragraphs  give  an  overview  of  UNIX. 


Unix  Operating  System 


The  following  paragraphs  describe  some  of  the  advantages 
and  disadvantages  of  using  the  UNIX  operating  system  to 
implement and use a program as developed here. Even though
the computer and operating system used in this project
were not chosen by this study, the study did provide a better
understanding of how to exploit UNIX's best features.

UNIX  was  written  by  Ken  Thompson  and  Dennis  Ritchie  for 
their  own  use  in  studying  operating  systems  and  other 
programming  projects.  They  succeeded  in  creating  a  system 
which  would  allow  them  to  easily  develop  programs.  UNIX  was 
developed  to  work  interactively  instead  of  in  the  more  common 
batch approach. A good text processing facility and text
editor were added because they were found to be crucial to
high productivity of programmers.

A flexible file system contributes directly to
increasing a programmer's capabilities. (1; 16; 21; 22) The
directory system is stored as a tree structure and allows
access to files by following paths. File protection is very
versatile and can be set at various levels from a branch of
the index to individual files. A set of protection flags
includes read, write, and execute flags for the owner, group
members, and the rest of the users.
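
As an illustration of these protection flags, the following Python
sketch decodes a numeric permission mode into the read, write, and
execute flags of each class of user. It relies on the constants of
the standard `stat` module; the function name is hypothetical and
the sketch is not part of the thesis software.

```python
import stat

# Flag bits for each class of user, in read/write/execute order.
CLASSES = (
    ("owner", (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR)),
    ("group", (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP)),
    ("others", (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH)),
)


def describe_permissions(mode):
    """Map each class of user to a three-character string such as 'r-x'."""
    result = {}
    for name, bits in CLASSES:
        result[name] = "".join(
            flag if mode & bit else "-" for flag, bit in zip("rwx", bits)
        )
    return result


if __name__ == "__main__":
    # Octal 754: owner may read, write, execute; group read and execute; others read.
    print(describe_permissions(0o754))
```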


The operating system is written as C modules, and commands
are given to it through the C shell by executing
one or more of these system modules. Having direct control
over the operating system gives the programmer access to many
powerful tools. One tool that was used heavily in this
thesis effort was make, which is driven by a Makefile. It
allows modules to be compiled independently and then linked
together to form a system. In addition, make only recompiles
modules that have been changed since they were last compiled,
therefore saving unnecessary processing time.
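
The recompilation rule can be stated in a few lines. The Python
sketch below only illustrates the timestamp comparison that make
performs; it is not make itself, and the function names are
hypothetical.

```python
import os


def is_stale(target_mtime, source_mtimes):
    """True when the target is missing (None) or older than any of its sources."""
    if target_mtime is None:
        return True
    return any(mtime > target_mtime for mtime in source_mtimes)


def needs_rebuild(target, sources):
    """Apply the same rule to files on disk using their modification times."""
    target_mtime = os.path.getmtime(target) if os.path.exists(target) else None
    return is_stale(target_mtime, [os.path.getmtime(path) for path in sources])


if __name__ == "__main__":
    print(is_stale(None, [100.0]))          # no object file yet
    print(is_stale(200.0, [100.0, 250.0]))  # one source edited after the last compile
    print(is_stale(200.0, [100.0, 150.0]))  # nothing changed since the last compile
```

A module whose object file is newer than all of its sources is left
alone, which is where the saving in processing time comes from.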

Although  UNIX  does  accomplish  its  main  goal  of  allowing 
the  programmer  to  easily  write  and  debug  programs,  the  user 
interface is not very friendly. The cat command, short for
concatenate, is used to list the contents of a file on the
terminal. The ls command, which is used to list directories
of files at the terminal, has 19 different switches that can
be used separately or in combination. These shortcomings,
which just take time to learn, are more difficult for
the occasional user than for the frequent user. (15)

The  large  number  of  installations  using  UNIX  also  makes 
using  it  desirable  because  systems  created  on  UNIX  are  easily 
transported  to  other  computers  running  UNIX.  This  high 
portability  makes  systems  even  more  valuable  because  of  the 
large  number  of  users  that  can  access  the  system. 


Software  Engineering  Concepts 

Before the principles of structured programming,
modularization, and hierarchical structure are used to
develop the design of the software environment, these
terms and their concepts will be discussed. The references
given are for those interested in further study of these
software design methods.

Structured  Programming 

"Structured  programming",  perhaps  the  most  abused  term 
in  the  modern  computing  literature,  derives  from  the  work  of 
Dahl,  Dijkstra  and  Hoare  (3).  Due  in  part  to  some 
unfortunate remarks in Dijkstra's paper (5), some people
came  to  believe  that  structured  programming  was  any  one  of 
goto-less  programming,  stepwise  refinement,  topdown  design, 
or  programming  in  ALGOL-like  languages.  The  essence  of 
structured  programming,  which  is  expounded  below,  was 
temporarily  lost  to  the  literature. 

The profound aspects of structured programming concern
the use of techniques for reducing the complexity of the
programming problem. By emphasizing the "structure" of
algorithms, programming sections, or data structures, it
becomes possible to separate the behavior of the program at
one level from the details of each of the components. Hence,
for example, it is useful to refine a program in steps
because the skeleton of the program can be shown to behave
correctly given some properties of the (unexpanded)
subprograms. Each of the subprograms can then be refined in
turn, in isolation from each other and from the program
skeleton in which they are embedded.

Stepwise  refinement  is  an  example  of  a  structured 
programming  technique.  It  is  a  tool  by  which  a  programmer 
can  record  one  aspect  of  the  complexity  of  a  program  in  order 
to  direct  attention  to  another  aspect.  The  process  of 
organizing  the  complexity  of  the  problem  is  accomplished  by 
the programmer, not the technique. Other important
techniques in structured programming include the definition
of abstract data types (25; 8; 11), hierarchical ordering of
program segments, and use of verifiable control structures
and operators.

Modularization 

Parnas consolidated the informal concepts of
modularization (13). Other concepts, such as separate
compilation, are not now included under this term. The
primary attribute of a module is that it hides a unit of
information. This may be a small implementation detail, such
as the actual location of a logical device, or a major design
decision, as in whether to sort a KWIC index before printing
it or at the same time (19).

Modules  contain  and  hide  information.  Sometimes  they 
can  be  encoded  as  programs  or  program  segments,  but  not 
always.  For  example,  the  method  for  ensuring  data  integrity 
over  procedure  calls  (the  "calling  sequence")  can  be 
contained  in  a  module,  but  the  actual  method  may  not  be  the 
same  for  two  different  caller/callee  pairs.  The  syntax  of  a 
command  language,  for  example,  is  independent  of  the  means  by 
which  it  is  represented  in  a  program. 

The recent exploration of abstract data types is an
important contribution to the process of modularization. For
many purposes, a module can take the form of a data
representation combined with a set of permissible operations
on that representation. Properly described, it serves as an
implementation of a mathematical object that can be discussed
separately from its implementation. This permits the abstract
behavior to be made available to a program that uses the data
type while hiding the mechanisms by which behaviors are
accomplished. This form of modularization is so widely
applicable that it has become the primary emphasis in the
design of several new programming languages (2; ?; 9; 12;
24; 25).
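
A small example may make the idea concrete. The Python sketch below
is an illustration chosen for this discussion, not a module of the
thesis system: a stack is described by its permissible operations
(push, pop, is_empty), and the list that stores the items is treated
as a hidden representation that could be replaced without changing
any program that uses the type.

```python
class Stack:
    """A last-in, first-out collection defined only by its operations."""

    def __init__(self):
        # Hidden representation; callers are expected to use the methods only.
        self._items = []

    def push(self, item):
        """Place an item on top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the most recently pushed item."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        """True when no items remain."""
        return not self._items


if __name__ == "__main__":
    stack = Stack()
    stack.push("first")
    stack.push("second")
    print(stack.pop(), stack.pop(), stack.is_empty())
```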

The  process  of  modularization  results  in  fragmentation 
of  the  system  representation.  When  systems  were  monolithic 
expanses  of  source  code,  recompiling  the  system  was  all  that 
was needed to integrate the system into runnable form. The
problem  of  systematically  combining  modules  into  systems  is 
basically  unsolved,  although  the  Mesa  designers  have  had  some 
success  (9). 

Hierarchical  Structure 

The  division  of  systems  into  levels  is  most  often 
attributed  to  Dijkstra  (6).  Most  of  the  experience  in 
this  field  has  been  provided  by  operating  system  projects 
(10; 13; 14; 23). Parnas (17) and others have argued for
such  organization  of  other  systems. 

A hierarchically organized system has an enforced
partial ordering on some objects in the system. Such a
system may be described as a sequence of levels, each level
being defined as the class of objects that are inferior
(according to the relation) to objects in higher levels and
superior to those in the lower levels.


Careful selection of objects and relations can result in
a system with some very nice properties. By using
"processes" as the objects, and "provides work to" as the
relation, freedom from deadlock in the system follows
immediately (6). If "functions" are restricted by "calls"
relations, so that function calls always descend a level in
the system, stack depth can be limited to a known maximum.
And, if sets of functions are ordered by "functional
dependency" (10), each layer corresponds to a virtual machine
that can be programmed without dependence on the upper levels.

Hierarchy  in  a  system  is  often  established  within  the 
information  hiding  lines.  Although  this  has  the  advantage  of 
simplicity,  difficulty  arises  in  real  systems  either  with 
establishing  a  hierarchy  at  all,  or  in  efficiently 
implementing  the  system  along  the  hierarchy  established. 

Since  the  two  concepts  are  not  dependent,  hierarchy  can  be 
established  "orthogonally"  to  modularization,  and  the 
benefits  of  both  structures  will  obtain  in  the  resulting 
system  (10). 

The clear specification of the hierarchical relation is
necessary to a clear understanding of the system structure
that will result (20).


Design 

Each  system  was  developed  using  top-down  design 
methodology.  The  basic  principle  in  top-down  design  is  to 
work  from  an  abstract  functional  description  of  a  problem 
(top)  to  a  detailed  solution  (bottom).  (4iA50) 

The  separately  developed  modules  increase  the 
reliability,  maintainability,  and  adaptability  of  the  entire 
system.  The  system  is  reliable  because  each  module's  results 
can  be  easily  tested  against  hand  calculated  results. 
Maintainability is increased by the modularity of the system
because changes or improvements to the system can be easily
made and reverified within the module. Modularity also
increases  adaptability  of  the  system,  a  feature  which  is 
important  for  future  maintenance  and  additional  applications. 

Each subsystem will be further designed using top-down
methodology. The subsystems will be implemented on a VAX
11/780 using the tools of the UNIX operating system to
facilitate the development of the entire system.
Sequence of Presentation


The main theme throughout this thesis is to develop a
system which provides a comfortable environment for the
statistical analysis described above. Chapter 2, entitled
"Project Design", describes the subsystems which make up the
environment being created in this thesis effort. Chapter 3
describes the algorithms in detail for each of the modules
which make up each system. Chapter 4 illustrates the methods
used to test the system and summarizes the test results to
demonstrate the accuracy of the test data. Finally, chapter 5
summarizes the results and makes suggestions for further work
in this area.

Appendix  A  contains  all  the  source  listings.  Appendix  B 
contains  detailed  test  results,  and  Appendix  C  is  a  users’ 
manual  for  the  system  created. 


II.  Project  Design 


Top-Level 

The first step in the design process is to interpret
the overall requirement, described in chapter one, into a
series of steps, each step being defined by one functional
statement. Each step is then further refined until the steps
at the lowest levels can be easily programmed into a
self-contained group of computer instructions often referred
to as a procedure or function. The decision as to when to stop
refining is usually a matter of judgement by the designer,
but some of the factors to consider are the reusability,
complexity, and length of the modules being created.

The  overall  project  requirements  are  as  follows.  Given 
the  number  of  exponential  populations,  size  of  the  samples, 
and  the  sample  values,  the  overall  system  must  perform  the 
statistical  analysis  necessary  to  test  the  hypothesis  that 
the  given  populations  are  identical. 

As analyzed in the first chapter, the overall system is
divided into three subsystems. The requirement for system
one is to calculate the likelihood ratio l-value from the
sample values provided by the user.


System  two  is  required  to  calculate  the  theoretical 
percentage  points  needed  for  the  analysis.  System  three's 
requirements  are  defined  as  performing  the  final  analysis  by 
using  the  results  from  systems  one  and  two.  Finally,  a 
driver  module  is  designed  which  would  input  the  needed 
responses  from  the  user,  execute  systems  one,  two,  and  three 
and  report  the  final  result  to  the  user. 

System one is contained in the procedure lmodule. The
only procedure lmodule needs to call is getreal. The general
purpose procedure getreal acts as an interface between the
computer program and the user; it allows the user to
enter the necessary data for the program.

The main module thesis would input the needed responses from
the user, execute systems one and two, and finally output the
results.

System two will use Newton's iterative method to
calculate the theoretical value. Module newton requires the R
coefficients. The R coefficients are obtained by calling
procedure modR, which does the actual calculations. Finally,
newton needs to call a probability function called prob. The
reason for introducing the modules modR and prob this early
is discussed in the following paragraph.


The module modR is called from the main module thesis
because this allows many calls to the module newton with
different alpha levels without recalculating the R
coefficients. The probability function chosen is defined as
an infinite series, so in principle an infinite number of
terms is needed, although the function does come close to
convergence after only a relatively few terms. The main
module determines the number of terms needed by calling the
probability function a number of times and summing the
results until the sum is close enough to one.

The user inputs needed for the overall project are the
number of samples taken, the size of each sample, the name of
the file if the data is stored in a text file, and the alpha
level to be tested for. The alpha level is one minus the
probability that the final result given by the program is
accurate. The size of each sample must be the same;
therefore the sample size is asked for only once. If the
number of samples and the sample size have not changed, the
user is allowed to test for different alpha levels without
waiting for the re-execution of system one and the parts of
system two that do not change. For example, the R coefficients
calculated in the module modR are independent of the alpha
level being tested for. The general purpose routine getreal
can be used to input any numbers.


System  three  compares  the  results  from  systems  one  and 
two  and  reports  the  results.  System  three  must  call 
procedure  lmodule,  which  represents  system  one,  and  procedure 
newton,  which  is  the  interface  to  system  two.  The  driver 
module  and  system  three  are  both  contained  in  the  main 
procedure  thesis. 

Before  the  design  becomes  too  detailed  an  understanding 
of  the  people  who  are  going  to  use  the  system  is  needed.  The 
assumption  made  for  this  project  is  that  the  users  understand 
the  statistics  involved  but  are  not  necessarily  familiar  with 
this project or computers. The process of designing a
program that is easy to use for an expected group of users
comes under the general term of "user friendly".

In  the  above  discussion  of  the  top  level  the  rather 
general  requirements  given  in  chapter  one  have  been 
transformed into the logic of the thesis module. The thesis
module  will  be  actually  coded  after  chapter  three's  detailed 
design,  but  the  necessary  modules  are  beginning  to  be  defined 
and  are  further  illustrated  in  Figure  1. 


The main module thesis calls system one and passes all
necessary information to it. System one, which calculates the
l value, only needs to know the name of the file where the
input data is stored, how many samples there are, and the size
of each sample. The only module system one needs to call is the
module getreal, to read in the data one number at a time. The
l value is calculated as described above by taking the
product of the sample means and dividing by the grand mean
raised to the pth power, where p is the number of samples.
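
The calculation of the l value can be sketched in a few lines. The
Python fragment below is an illustration only; the thesis procedure
lmodule is a Pascal routine whose listing is in Appendix A, and the
function name and list-of-lists input used here are assumptions made
for the example.

```python
import math


def likelihood_ratio_l(samples):
    """Product of the sample means divided by the grand mean to the pth power.

    `samples` is a list of p samples, each a list of n observations.
    """
    p = len(samples)
    n = len(samples[0])
    if any(len(sample) != n for sample in samples):
        raise ValueError("all samples must have the same size")
    means = [sum(sample) / n for sample in samples]
    # With equal sample sizes the grand mean is the mean of the sample means.
    grand_mean = sum(means) / p
    return math.prod(mean / grand_mean for mean in means)


if __name__ == "__main__":
    print(likelihood_ratio_l([[1.0, 3.0], [2.0, 2.0]]))  # equal means
    print(likelihood_ratio_l([[1.0, 1.0], [3.0, 3.0]]))  # unequal means
```

The value is 1 when all sample means are equal and falls toward 0 as
the means spread apart, so small values count against the hypothesis
of a common population.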

System  Two 

System two, which calculates the theoretical value, is
far more complicated and requires many more modules. The
first breakdown of system two is into three steps. Step one
is to calculate the R coefficients by calling modR and
specifying the number of coefficients required. By knowing
the number of samples and the sample size of each, the main
module thesis can estimate a maximum number of coefficients
required from modR.

The next step is to determine the number of terms
necessary to have the probability returned by module prob
equal to one when the variable x is set to one. This is done
because the function is defined to have an infinite sum of
terms, but the function is known to converge close to one
after only relatively few terms. The probability is
calculated to seven places of accuracy since, after
considering the accuracy of system one's calculation of the l
value and the precision of the computer, seven places of
precision was determined to be sufficient to calculate the
number of terms needed by the module newton.
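
The search for the number of terms can be sketched as a loop that
adds terms until the partial sum is within seven decimal places of
one. The Python fragment below is a hypothetical illustration: the
function name is an assumption, and a geometric series that sums to
one stands in for the terms of the actual probability function,
which are not reproduced here.

```python
def terms_needed(term, tolerance=1e-7, max_terms=10000):
    """Smallest number of leading terms whose sum is within `tolerance` of one.

    `term(r)` must return the rth term (r = 0, 1, 2, ...) of a series
    that sums to one.
    """
    total = 0.0
    for count in range(1, max_terms + 1):
        total += term(count - 1)
        if abs(1.0 - total) <= tolerance:
            return count
    raise ValueError("series did not come close enough to one")


if __name__ == "__main__":
    # Stand-in series: 1/2 + 1/4 + 1/8 + ... = 1.
    print(terms_needed(lambda r: 0.5 ** (r + 1)))
```

The count found this way can then be reused for every later
evaluation of the series, so the search is paid for only once.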

The last step of system two is to calculate a
theoretical value to compare system one's l value with by
using Newton's iterative method, described in chapter three's
discussion of module newton. The module newton returns the
theoretical result system two was created for.
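
In general terms, Newton's method finds the percentage point x at
which a distribution function F reaches a chosen alpha level by
repeating x <- x - (F(x) - alpha) / f(x), where f is the density.
The Python sketch below illustrates only this general iteration; it
is not the thesis module newton, and the simple distribution
F(x) = x^2 on (0, 1) is a stand-in chosen so that the answer,
sqrt(alpha), can be checked by hand.

```python
def newton_percentage_point(cdf, pdf, alpha, start, tolerance=1e-10, max_iterations=50):
    """Solve cdf(x) = alpha by Newton's iteration, beginning at `start`."""
    x = start
    for _ in range(max_iterations):
        step = (cdf(x) - alpha) / pdf(x)
        x -= step
        if abs(step) < tolerance:
            return x
    raise ValueError("Newton's method did not converge")


if __name__ == "__main__":
    # Stand-in distribution: F(x) = x**2 with density f(x) = 2x on (0, 1).
    point = newton_percentage_point(lambda x: x * x, lambda x: 2.0 * x, 0.05, 0.5)
    print(point)
```

A good starting value matters for any Newton iteration, since a poor
one can send the iteration outside the range where F is defined.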

To calculate the R coefficients modR calls modules Queue
and createCjr for the Q and Cjr coefficients respectively.
Queue in turn calls ArCoef to calculate the Ar coefficients
needed by module Queue. Module createCjr calls createAjr for
the Ajr coefficients it needs.

Both modules ArCoef and createAjr call module bernpoly
to calculate a value for a specific x calculated by a
specific Bernoulli polynomial. Module bernpoly calls modules
b and com. Module b calculates Bernoulli numbers and com
calculates the number of possible combinations as described
in elementary probability. Module b also calls module com.
The breakdown of modR is further illustrated in Figure 2.

Figure 2. modR
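
The relationship among com, b, and bernpoly can be illustrated with
the standard definitions. The Python sketch below reuses the module
names from the text but is not the thesis code: it assumes the
common recurrence for the Bernoulli numbers (with B_1 = -1/2) and
uses exact fractions, whereas the Pascal modules may use a different
method and real arithmetic.

```python
from fractions import Fraction
from math import comb


def com(n, k):
    """Number of combinations of n items taken k at a time."""
    return comb(n, k)


def b(n):
    """nth Bernoulli number from the recurrence sum_{k<=m} C(m+1, k) B_k = 0."""
    numbers = [Fraction(1)]
    for m in range(1, n + 1):
        total = sum(com(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-total / (m + 1))
    return numbers[n]


def bernpoly(n, x):
    """Value at x of the nth Bernoulli polynomial, sum_k C(n, k) B_k x^(n-k)."""
    x = Fraction(x)
    return sum(com(n, k) * b(k) * x ** (n - k) for k in range(n + 1))


if __name__ == "__main__":
    print([b(n) for n in range(5)])
    print(bernpoly(2, Fraction(1, 2)))
```

As in the text, b calls com, and bernpoly calls both b and com.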

The  probability  function  prob  needs  to  call  functions 
loggam  and  ibeta.  Function  ibeta  calls  function  beta,  which 
in  turn  calls  function  loggam. 
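
The reason beta calls loggam is that the beta function is most
safely computed from logarithms of gamma functions, which avoids
overflow for large arguments. The Python sketch below shows that one
relationship, B(a, b) = exp(log G(a) + log G(b) - log G(a + b)); it
borrows the names loggam and beta from the text, relies on the
standard library routine `math.lgamma`, and leaves out ibeta, whose
method is not described in this chapter.

```python
import math


def loggam(x):
    """Natural logarithm of the gamma function (standard library routine)."""
    return math.lgamma(x)


def beta(a, b):
    """Beta function computed through logarithms of gamma functions."""
    return math.exp(loggam(a) + loggam(b) - loggam(a + b))


if __name__ == "__main__":
    # B(2, 3) = 1! * 2! / 4! = 1/12
    print(beta(2.0, 3.0))
```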

The  module  newton  uses  Newton's  iterative  method  to 
calculate the theoretical value for comparison with system
one's  result.  Module  betap  calculates  a  starting  value  for 
newton  and  module  prob  is  called  until  the  theoretical  value 
is  obtained.  Module  prob  is  described  in  the  previous 
paragraph  and  module  betap  only  needs  to  call  function  ibeta. 
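
Because every call in this design descends from a module to the
modules it uses, the deepest possible chain of calls can be found
directly from the breakdown. The Python sketch below records the
calls as read from the description in this chapter (the table is an
interpretation of the text, not a listing taken from the source
code) and computes the length of the longest chain.

```python
# Calls between modules as described in this chapter (a reading of the text).
CALLS = {
    "thesis": ["lmodule", "modR", "prob", "newton"],
    "lmodule": ["getreal"],
    "modR": ["Queue", "createCjr"],
    "Queue": ["ArCoef"],
    "createCjr": ["createAjr"],
    "ArCoef": ["bernpoly"],
    "createAjr": ["bernpoly"],
    "bernpoly": ["b", "com"],
    "b": ["com"],
    "newton": ["betap", "prob"],
    "betap": ["ibeta"],
    "prob": ["loggam", "ibeta"],
    "ibeta": ["beta"],
    "beta": ["loggam"],
}


def call_depth(module, calls):
    """Number of modules on the longest chain of calls starting at `module`."""
    callees = calls.get(module, [])
    return 1 + max((call_depth(callee, calls) for callee in callees), default=0)


if __name__ == "__main__":
    print(call_depth("thesis", CALLS))
```

Under this reading the longest chain runs from thesis through modR,
Queue, ArCoef, bernpoly, and b to com, which bounds the stack depth
at seven active modules.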

The numerical results obtained by the two subsystems are
compared with each other at a preassigned significance level
and the result of this comparison is made available to the
user. The message will either be: (1) there is sufficient
evidence that the data does not come from the same population
or (2) all the data does come from the same population.
The breakdown of system two is illustrated in Figure 3.


Figure 3. System 2


System  Three 


System three, as discussed earlier, is contained in the
main module thesis. The module thesis, along with all the
other modules, will be discussed in detail in chapter three.

As can be seen from this example, a complicated project
can be easily broken down into single-function modules. The
modules can then be individually designed and coded. The
design of the individual modules is described in detail in
chapter three. An additional side effect of modular
programming is that many modules, for example getreal, can be
reused in the same project or in future projects.


III.  Detailed  Design 


General Guidelines

The next few paragraphs explain some of the design
characteristics common to all of the modules. Modules,
routines, and procedures all refer to a self-contained sequence
of related computer statements which perform a single task.

Although  written  to  run  on  a  Vax  11-780  computer  with  a 
Unix  operating  system,  thesis  could  run  on  most  systems  with 
few,  if  any,  modifications.  These  routines  should  also  work 
with  most  Pascal  compilers  due  to  standardization  of  the  basic 
keywords  used  in  these  procedures. 

Global arrays were used to store most of the coefficients
calculated in system two. In general the use of global
variables is discouraged because of the difficulty in debugging;
but in this particular set of modules the use of global arrays
was found to be necessary. Global arrays were used in this set
of programs because the same coefficients, at the lower levels,
would otherwise have to be recalculated thousands of times, which
would result in a very slow overall response time. For example, the
user may have to wait two hours for a response that now takes
less than five minutes using global arrays.


Getreal 


Getreal  is  a  general  purpose  Pascal  procedure  which 
transfers  real  numbers  from  a  text  file  to  a  Pascal  program. 
Anyone  with  a  need  to  process  real  numbers  in  Pascal  could  use 
this  routine.  Getreal  requires  two  parameters  passed  to  it  when 
called  by  a  program. 

NUM contains the real number read in by Getreal, and the
boolean variable FLAG reflects the status of that read. If the
number read in contains a non-numeric character, Getreal sets
FLAG to TRUE to indicate to the calling program that Getreal
found bad data. Getreal reads the number in three sequential
steps. Deleting the leading blanks and determining the sign of
the number precedes the last two steps of reading the integer and
decimal parts of the number. A blank or end-of-line character
indicates the end of the number. Before discussing these three
steps in detail, the following paragraph describes the internal
variables required by Getreal.

As previously mentioned, in addition to the variables
passed to and returned from the procedure, Getreal requires the
internal variables SIGN, CH, I, and DECIMAL. The integer
variable SIGN evaluates to either plus or minus one depending
on the sign of the number read in. While Getreal reads the
number, CH temporarily holds each digit of the number in
character form. The real variable I contains the degree of
precision of the most significant decimal place as Getreal reads
the number. The boolean flag DECIMAL tells the procedure which
side of the decimal point the digits come from.

The first step deletes any leading blanks and sets the
variable SIGN to the sign of the number returned. If this step
does not find a minus or plus sign, a digit, or a decimal point,
FLAG returns to the calling program as TRUE to indicate an
error. SIGN is initialized to positive one. Finding a digit or
decimal point first implies a positive number, and SIGN remains
equal to positive one. If an end-of-line condition exists from
a previous call, Getreal executes a READLN (read line command)
to skip to the beginning of the next line of input data. A
WHILE loop (a programming loop that continues to re-execute a
group of statements while a condition is true) then reads blanks
one at a time until it encounters a non-blank character.
Depending on the non-blank character read, Getreal either sets
NUM to the digit, DECIMAL to TRUE for a decimal point, SIGN to
the appropriate value, or, if none of the above, FLAG to TRUE to
indicate an error condition.

The integer part of the procedure uses a WHILE loop to read
in one digit at a time until it reaches the end of the current
line, a blank, bad data, or a decimal point. Getreal logically
executes the integer step between the sign determination and
decimal steps, as they appear in the real number itself. Getreal
converts the characters read to digits by subtracting
ORD('0') (the ordinal value of the character zero) from ORD(CH).
Because the computer stores digits in patterns of zeros and ones
(ordinal values) in numeric order, the offset from zero gives
the value of the digit. Getreal then adds the digit to ten times
the previous partial number. As an example of the basic process,
consider why 5319 equals (((5*10)+3)*10+1)*10+9. After finding
a decimal point in the input stream of characters, Getreal
starts the final step of reading the decimal part.

In this final step Getreal reads in the decimal part
of the real number. A WHILE loop reads in the digits one at a
time until it encounters a non-digit character. Converting the
characters read in to their numeric form occurs in the same way
as described in the integer part. The variable I holds the
current precision of the last digit read in. Appending the
digits to NUM in this section takes place by dividing the digit
by I and adding the quotient to NUM. As an example, consider why
.739 equals (7/10) + (3/100) + (9/1000). The final statement in
Getreal multiplies NUM by SIGN in order to return a negative
number, if appropriate.
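
The three steps can be sketched in Python as follows. This is a minimal illustration rather than the Pascal source: the function name get_real, the helper is_digit, and the choice to parse a string instead of a text file are assumptions, and the READLN handling and warning message are left out. The local variables sign and i mirror SIGN and I from the description.

```python
def is_digit(ch):
    """True for the ASCII characters '0' through '9'."""
    return "0" <= ch <= "9"


def get_real(text):
    """Parse one real number from text and return (num, flag).

    flag is True when bad data is found, mirroring FLAG in the description.
    """
    pos = 0
    # Step 1: skip leading blanks and determine the sign.
    while pos < len(text) and text[pos] == " ":
        pos += 1
    sign = 1
    if pos < len(text) and text[pos] in "+-":
        if text[pos] == "-":
            sign = -1
        pos += 1
    if pos >= len(text) or not (is_digit(text[pos]) or text[pos] == "."):
        return 0.0, True
    # Step 2: integer part, one digit at a time.
    num = 0.0
    while pos < len(text) and is_digit(text[pos]):
        num = num * 10 + (ord(text[pos]) - ord("0"))
        pos += 1
    # Step 3: decimal part; i holds the precision of the current digit.
    if pos < len(text) and text[pos] == ".":
        pos += 1
        i = 10.0
        while pos < len(text) and is_digit(text[pos]):
            num += (ord(text[pos]) - ord("0")) / i
            i *= 10
            pos += 1
    # Only a blank or the end of the input may terminate the number.
    if pos < len(text) and text[pos] != " ":
        return 0.0, True
    return sign * num, False
```

For example, get_real("5319") returns the number 5319.0 with the flag cleared, while get_real("12a4") returns with the flag set.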

Getreal is an efficient general purpose routine to read a
real number from a text file into a Pascal program. Pascal has
limited input and output capabilities, and therefore relies on
user-written routines. Pascal implements faster and more
efficient assembler macros; however, the importance of error
detection often supersedes this advantage in speed. If Getreal
detects an error it simply gives a warning and skips the rest of
the current line of data instead of abruptly ending the program.
Although many alternatives exist, the one presented compares
favorably in efficiency and ease of use to most others. The
steps of Getreal correspond to the format of the real number
read. Getreal will also work for integers, and the real
fractions input can be of the form 0.05 or .05.

lmodule

The function lmodule takes in the input data and calculates
the l value for output. The l value is calculated by taking the
product of the sample means and dividing by the grand mean raised
to the nth power, where n is the number of samples in the input
data. The equation is shown below

\[ l = \frac{\prod_{i=1}^{n} \bar{x}_i}{\bar{x}^{\,n}} \qquad (12) \]

where

n = number of samples
$\bar{x}_i$ = individual sample means
$\bar{x}$ = grand mean


If the program is run interactively, the user will be
prompted with the count of numbers in the sample, and the user
will be able to reenter the number if bad data is entered
mistakenly. The batch version will end with a bad data message
if bad data is in the file. Both versions accept positive
real numbers.

The parameters input into lmodule are filename, s, and n.
filename contains the name of the file where the input data is
stored, or the string "input" to indicate that the data will be
entered interactively. s is the sample size and n is the number
of samples. The number of data points needed can be determined
by multiplying s times n.

The function lmodule outputs the l value, which is later
compared to the theoretical value returned by the function
newton.

The first step of lmodule is to determine the sample means
of each of the samples and multiply them all together. The next
step, which is performed at the same time, is to calculate the
grand mean. Finally, the l value is calculated by dividing the
product of the sample means by the grand mean raised to the nth
power, as shown in equation (12) above.
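
A minimal Python sketch of this calculation follows. It is an illustration only: the function name l_value and the list-of-lists input are assumptions, and the file and interactive input handling of lmodule is omitted.

```python
import math


def l_value(samples):
    """Product of the sample means divided by the grand mean to the nth power.

    samples is a list of n samples, each a list of positive real numbers.
    """
    n = len(samples)
    sample_means = [sum(s) / len(s) for s in samples]
    total_points = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / total_points
    return math.prod(sample_means) / grand_mean ** n
```

When every sample has the same mean the result is one. For two samples with means 1 and 3 and a grand mean of 2, the result is 3/4.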


com


The com module is a function which returns the number of
combinations obtained from n objects taken r at a time. This
result is also known as the binomial coefficient. The binomial
coefficients, used by the function bernpoly, are calculated by
the equation

\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!} \qquad (13) \]

where

r <= n
n! = 1*2*3*...*n
0! = 1

Once calculated, the result is returned to the function bernpoly
or procedure b for its use.

The parameters input to the function com are integers n
and r. There are no parameters output from module com, but the
function does return the real result described by equation (13)
shown above.

The module com doesn't need to call any other modules; all
the steps required to obtain the result are contained within the
module.

Two temporary internal variables are used by com. The
integer j is used as a loop counter and the real variable prod
is used to hold intermediate results.

The  structure  of  the  module  com  is  two  loops.  The  first 
loop  multiplies  prod,  which  holds  the  intermediate  results, 
times  the  quotient  of  n  divided  by  r.  Each  time  through  the 
loop  n  and  r  are  both  reduced  by  one.  The  first  loop  ends  when 
r  is  equal  to  zero.  The  second  loop  is  similar  to  the  first 
except  that  instead  of  r  the  internal  variable  j,  which  had  been 
initialized  to  n  minus  r,  is  used.  Finally,  the  function 
returns  the  value  of  prod. 
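
The first loop can be sketched in Python as shown below. This is a hedged illustration, not the thesis code: multiplying by the quotient n/r at each pass keeps the running product close to the final answer, so no large factorial is ever formed. The sketch relies on the fact that this loop alone already yields the binomial coefficient; the second loop driven by j is not reproduced because its exact role cannot be recovered from the description.

```python
def com(n, r):
    """Number of combinations of n objects taken r at a time, as a real."""
    prod = 1.0
    while r > 0:
        prod *= n / r   # multiply by the quotient of n divided by r
        n -= 1          # both n and r are reduced by one on each pass
        r -= 1
    return prod
```

Because the arithmetic is done in floating point, the result can differ from the exact integer by a small rounding error for larger arguments.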

There are more efficient ways to calculate combinations, but
the way com was programmed is not much slower and does avoid
the overflow problems associated with other methods.
Combinations and binomial coefficients are basic to elementary
probability, which increases this module's usefulness outside
this thesis effort.




b 


The b module is a procedure which calculates the Bernoulli
numbers that are used by the bernpoly module. The Bernoulli
numbers are calculated by equation (14), where

b() = Bernoulli numbers being calculated
com() = calls to the com module

Once calculated, the coefficients are stored in the global array
b for the module bernpoly to use.


The only parameter input into module b is n, the number
of Bernoulli numbers required by bernpoly. There are no
parameters output from module b; all the results are stored in
global array b.

Module b does not need to call any other modules. All the
required calculations needed to create Bernoulli numbers are
contained within the procedure.

The temporary internal variables used are j, i, and sum.
The integers j and i are used as loop counters. The outside
loop, which ranges from one to n, is controlled by j. The
inside loop, which ranges from zero to j minus one, is
controlled by i. The real variable sum holds the summation of
the intermediate results of the inner loop, as shown in
equation (14) above.

The first step of module b is to set the global variable b
flag to true. b flag is set to indicate that module b has
already been called once and doesn't need to be called again
for the rest of the program. Bernoulli numbers are constants,
and their recalculation would needlessly slow the overall
program down. The next operation is to initialize b[0] to one.
The rest of the module consists of two loops, one inside the
other. The outside loop counter j represents the current
Bernoulli number being calculated. The inside loop sums all the
terms required to calculate each Bernoulli number.

b is an efficient procedure to calculate the Bernoulli numbers
required by the function bernpoly. The resulting numbers are
stored in the global array b.
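
The two-loop structure can be sketched in Python. The sketch assumes the standard recurrence B_j = -(1/(j+1)) * sum over i from 0 to j-1 of C(j+1, i) * B_i with B_0 = 1, which matches the loop ranges described above; the form of the recurrence actually used in the thesis may differ. The function name bernoulli_numbers is hypothetical, and math.comb stands in for the com module.

```python
from math import comb


def bernoulli_numbers(n):
    """Return the list [B_0, B_1, ..., B_n] as real numbers."""
    b = [0.0] * (n + 1)
    b[0] = 1.0
    for j in range(1, n + 1):        # outside loop: current Bernoulli number
        total = 0.0
        for i in range(j):           # inside loop: i runs from zero to j minus one
            total += comb(j + 1, i) * b[i]
        b[j] = -total / (j + 1)
    return b
```

With this convention the first few values are 1, -1/2, 1/6, 0, and -1/30.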


bernpoly


The bernpoly module is a function that solves a Bernoulli
polynomial for a specific value given the number of terms. This
function is used by both modules ArCoef and CreateAjr in their
calculations. The general equation is equation (15), where

n = the number of terms
k = the kth term
x = the value solved for
b = the Bernoulli numbers
calculated in module b.

Once the equation is evaluated for a specific value of x, a
single value is returned to ArCoef or CreateAjr.

The parameters input into bernpoly are n and x. Both n
and x are described above.

There are no parameters output from bernpoly, only the
single value calculated in equation (15).


bernpoly needs to call modules com and b. Module com
evaluates combinations as described earlier in this chapter and
in elementary probability. The Bernoulli numbers are calculated
in module b, and stored in global array b, only once during the
running of the program. The global binary flag b flag is set to
true after the first time bernpoly calls module b. Bernoulli
numbers don't depend on any variables, and the only information
module b needs to know is how many numbers to calculate, which
is determined in the main module thesis.

The temporary internal variables used by module bernpoly
are k, sum, and power. k represents the current term being
evaluated, sum holds the summation of the terms previously
calculated, and power holds the different powers of x needed in
equation (15).

The first step in module bernpoly is to call module b if it
wasn't called previously. The next step is to initialize
variables sum and power to 0.0 and 1.0 respectively. The last
step is to loop through all n terms, evaluating each as
described in equation (15). By summing the terms while the
program loops, the variable sum will contain the result when the
loop finishes.
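
These steps can be sketched in Python under the assumption that the polynomial has the standard form B_n(x) = sum over k from 0 to n of C(n, k) * B_k * x^(n-k); the thesis may write the sum differently. The helper bernoulli_numbers is a hypothetical stand-in for module b, and math.comb stands in for module com. The loop runs k downward so that power can be built up by repeated multiplication, starting from 1.0.

```python
from math import comb


def bernoulli_numbers(n):
    """Stand-in for module b: returns [B_0, ..., B_n] (standard recurrence)."""
    b = [1.0] + [0.0] * n
    for j in range(1, n + 1):
        b[j] = -sum(comb(j + 1, i) * b[i] for i in range(j)) / (j + 1)
    return b


def bernpoly(n, x):
    """Evaluate the Bernoulli polynomial of degree n at the point x."""
    b = bernoulli_numbers(n)
    total = 0.0                      # plays the role of sum
    power = 1.0                      # holds x**(n - k) for the current term
    for k in range(n, -1, -1):
        total += comb(n, k) * b[k] * power
        power *= x
    return total
```

As a check, the degree-one polynomial is x - 1/2, so bernpoly(1, 0.5) is zero up to rounding.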

bernpoly  is  an  efficient  procedure  to  evaluate  a  bernoulli 
polynomial  for  a  specific  value. 


ArCoef 


The ArCoef module is a procedure which calculates the Ar
coefficients that are used by the Queue module. The Ar
coefficients are calculated by equation (16), where

k = the kth coefficient
p = number of samples
l = (p + 1)/6*p
b() = calls to the bernpoly module

Once calculated, the coefficients are stored in the global array
Ar for module Queue to use.

The parameters input to the ArCoef module are r, p, and
l. r plus one is the number of coefficients to be evaluated,
always starting with zero. A[0] is always set to 1.0. p and l
are described above.

There are no parameters output from ArCoef. The global
array Ar is updated.

bernpoly is the only module which ArCoef needs to call.
The values returned by the function bernpoly are used to
calculate the Ar coefficients as shown in equation (16) above.

The temporary internal variables used are sign, partial,
ptor, k, rt, and pl. sign is set to either plus or minus one as
determined by raising minus one to the kth power, and represents
the final sign of the kth coefficient being calculated. partial
is a temporary holding variable for intermediate results. The
integer k ranges from zero to r, and represents the current
coefficient being evaluated. rt and pl are variables which are
passed to the module bernpoly and contain the values r plus one
and p times l respectively.

The first step in ArCoef is to set the first element in the
array Ar (Ar[0]) to one. To evaluate the next r terms, the
temporary variables sign, ptor, rt, and pl are evaluated,
bernpoly is called, and then each coefficient is calculated by
equation (16).

ArCoef  is  an  efficient  procedure  to  calculate  Ar 
coefficients  required  by  the  procedure  Queue.  The  resulting 
coefficients  are  stored  in  the  global  array  Ar. 


Queue


The Queue module is a procedure which calculates the Q
coefficients that are required by the modR module. The Q
coefficients are calculated by equation (17), where


r  =  the  term  being  evaluated 
k  =  the  kth  coefficient 
A  =  the  Ar  coefficients 
Q  =  the  Q  coefficients 

Once  calculated,  the  coefficients  are  stored  in  the  global  array 
Q  for  module  modR  to  use. 

The parameters input into the Queue module are r, p, and
l. r plus one is the number of coefficients to be evaluated,
always starting with zero. Q[0] is always set to 1.0. p and l
are passed to the module ArCoef.

There are no parameters output from Queue. The global
array Q is updated.


The  module  ArCoef  is  the  only  module  which  Queue  needs  to 
call.  The  values  calculated  by  module  ArCoef  are  stored  in 
global  array  Ar  and  used  by  module  Queue  as  shown  in  equation 
(17). 

The temporary internal variables used by module Queue are i,
k, and sum. The integer i ranges from zero to r and represents
the current coefficient being evaluated. The integer k, ranging
from one to i, is used as an internal loop counter for summing
the terms that make up each Q coefficient. Real variable sum
adds the intermediate terms together as they are being
calculated in the internal loop.

The first step of module Queue is to call module ArCoef to
calculate the Ar coefficients needed. The next step is to
initialize the first Q coefficient Q[0] to 1.0. Finally, each
Q coefficient is calculated as shown in equation (17), starting
with Q[1] and continuing until Q[r].

The module Queue is an efficient procedure to calculate the
Q coefficients required by module modR. The source code for
module Queue is a straightforward translation of equation (17).
The resulting Q coefficients are stored in global array Q.


CreateAjr


The CreateAjr module is a procedure which calculates the
Ajr coefficients that are used by the CreateCjr module. The Ajr
coefficients are calculated by equation (18), where

j and r = the row and column respectively of the
coefficient being evaluated
a and v = values passed to CreateAjr from CreateCjr
b() = calls to the bernpoly module.

Once calculated, the coefficients are stored in the global array
Ajr for module CreateCjr to use.

The parameters input to the CreateAjr module are j, r, a,
and v. Integers j and r define the size of the matrix of
coefficients to be calculated for module CreateCjr. The real
variables a and v are used as shown in equation (18) above.

There are no parameters output from module CreateAjr.
The global array Ajr is updated.

bernpoly is the only module which CreateAjr needs to call.
The values returned by the function bernpoly are used to
calculate the Ajr coefficients as shown in equation (18) above.

The temporary internal variables needed by module CreateAjr
are l, m, sign, rt, and temp. The integers l and m are used as
loop counters to specify the current coefficient being
evaluated. Integers sign and rt are used to speed up the
execution time and make the source code more readable. sign
alternates between plus and minus one and represents minus one
raised to the r minus one power in equation (18). rt always
contains the value of the column number of the coefficient
being evaluated, plus one. The real variable temp holds
intermediate results which can be used by all the coefficients
in the same column; this avoids recalculation of intermediate
results, which in turn speeds up the execution time of the module.

The first step is to initialize sign to positive one. All
the values in column zero in array Ajr are also set to positive
one. Once the variables are initialized, the rest of the
coefficients are evaluated, one at a time, using equation (18).

CreateAjr  is  an  efficient  procedure  to  calculate  the  Ajr 
coefficients  required  by  procedure  CreateCjr.  The  resulting 
coefficients  are  stored  in  the  global  array  Ajr. 


CreateCjr 


The CreateCjr module is a procedure which calculates the
Cjr coefficients that are used by module modR. The Cjr
coefficients are calculated by equation (19), where

j and r = the row and column respectively of
the coefficient being evaluated
A = Ajr coefficients
C = Cjr coefficients.

Once calculated, the coefficients are stored in the global array
Cjr for module modR to use.

The parameters input to CreateCjr are j, r, a, and v.
Integers j and r define the size of the matrix of coefficients
to be calculated. The real variables a and v are passed to
module CreateAjr and are not used directly by module CreateCjr.

There are no parameters output from module CreateCjr.
The global array Cjr is updated.

CreateAjr is the only module which CreateCjr needs to call.
The values created by module CreateAjr are used by module
CreateCjr as shown in equation (19) above.

The temporary internal variables used by CreateCjr are l, m,
k, and sum. The integers l, m, and k are all loop
counters. l and m represent the column and row number of the
current coefficient being evaluated, and k is used, as shown in
equation (19) above, to loop through the intermediate results
which must be summed. The real variable sum holds the summation
of the intermediate results for each coefficient.

When the module CreateCjr is first entered, the module
CreateAjr is called to calculate all the required Ajr
coefficients. The next step is to initialize all the
coefficients in column zero to positive one. The rest of the
module calculates the remaining coefficients as described by
equation (19), one at a time. The coefficients are calculated in
row major order, starting with zero and working up to row j,
for columns one through r.

Module CreateCjr is an efficient implementation of equation
(19). The calculated coefficients are stored in global array Cjr
for module modR to use.


modR


The modR module is a procedure which calculates the R
coefficients for the main module thesis, but which are used by
the probability function prob. The R coefficients are
calculated by equation (20), where

r  =  the  rth  term 
Q  =  the  Qr  coefficients 
C = the Cjr coefficients

Once calculated, the R coefficients are stored in the global
array R for module prob to use.

The parameters input into the modR module are j, r, a, v,
p, and l. The only parameter module modR actually uses is r,
which is the number of terms plus one it must calculate. The
coefficients always start with R[0] and are calculated
consecutively up to R[r]. R[0] is initialized to 1.0. The rest
of the parameters are passed to modules Queue and CreateCjr to
calculate the Q and Cjr coefficients required by modR.


There are no parameters output from module modR. The
global array R is updated.

modR needs to call modules Queue and CreateCjr to calculate
the Q and Cjr coefficients respectively. The coefficients are
used as described in equation (20) above.

The temporary internal variables used by modR are i, k, and
sum. The integers i and k are used as loop counters. The
outside loop is controlled by i, which also represents the
current coefficient being calculated. The inside loop counter
is k. k ranges from one to i and represents the intermediate
terms in the summation part of equation (20). The real variable
sum holds the sum of the intermediate terms for each coefficient.

The first step of module modR is to call modules Queue
and CreateCjr. The rest of the module calculates the R
coefficients one at a time by executing the following steps.
First, evaluate the inside summation of equation (20). Then
subtract the sum from the Q coefficient. Finally, divide the
result by the Cjr coefficient. The final result is stored in
the global array R, and the next R coefficient can then be
calculated.

Module modR is a straightforward and efficient
implementation of equation (20).




loggam 


The module loggam is a function which calculates the
natural logarithm of the gamma function at a specific point sent
to it by either the module prob or beta, which call it. The
equation used is equation (21), where


dp,  dz,  dw,  dv,  du,  dt,  dr,  and  dq  are  constants 
dy  and  dterm  =  variations  of  dx 
ds  =  de  /  (dy  *  dy) 

Once  calculated  the  result  is  returned  to  the  calling  module. 

The only parameter input into the function loggam is dx.
The real variable dx is the point for which the natural logarithm
of gamma is needed.

There are no parameters output. The only value returned
is the result of the function.

The function loggam is a self-contained function and does
not need to call any other modules to complete its calculations.

All the temporary internal variables, except dy, dterm, and
ds, are used as constants. The constants are used for clarity
because the constants contain many digits and would make the
final equation unreadable. The variables dy, dterm, and ds are
used as shown in equation (21).

The logic for the module loggam is straightforward. Once
the value is determined to be within a meaningful range, equation
(21) is implemented. After the calculation is complete the
result is returned to the calling module.

Function loggam is a fairly efficient implementation of the
gamma function. Some efficiency is lost in assigning constants
to variables, but the small amount lost was determined necessary
to increase the readability. The natural logarithm of the result
is returned because it is much easier to calculate.
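
One common way to compute the natural logarithm of the gamma function is Stirling's asymptotic series, sketched below in Python. This is an assumption made for illustration: the constants and exact form used by loggam are not reproduced here, and the threshold of 10 and the three-term series are choices made for this sketch. Small arguments are shifted upward with the identity gamma(x + 1) = x * gamma(x) so that the series is accurate.

```python
import math


def loggam(x):
    """Natural logarithm of gamma(x) for x > 0 using Stirling's series."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Shift the argument up until the asymptotic series is accurate.
    shift = 0.0
    y = x
    while y < 10.0:
        shift += math.log(y)
        y += 1.0
    inv2 = 1.0 / (y * y)
    # 1/(12y) - 1/(360 y^3) + 1/(1260 y^5)
    series = (1.0 / 12.0 - inv2 * (1.0 / 360.0 - inv2 / 1260.0)) / y
    stirling = (y - 0.5) * math.log(y) - y + 0.5 * math.log(2.0 * math.pi)
    return stirling + series - shift
```

Working with the logarithm avoids overflow: gamma itself exceeds the range of double precision for arguments a little above 171, while its logarithm stays small.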


beta


The function beta calculates the beta distribution result
given the two points from ibeta by using equation (22), where


loggam() = calls to the function loggam
p, q = points for which the function is evaluated

Once calculated, the result is returned to ibeta.

The only parameters input into beta are p and q. The
real variables p and q determine the point of the beta
distribution required by the module ibeta.

There are no parameters output. The only value returned
is the result of the function.

The logic of the function beta is contained in equation
(22). Once called by ibeta, beta calls the function loggam
three times, calculates the result as shown in equation (22),
and returns the result to ibeta.
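
Three calls to a log-gamma routine are exactly what the standard identity B(p, q) = gamma(p) * gamma(q) / gamma(p + q) requires when it is evaluated in logarithms. The Python sketch below assumes that identity is what beta implements, which is not confirmed here, and uses the library function math.lgamma in place of loggam.

```python
import math


def beta(p, q):
    """Beta function B(p, q) computed from three log-gamma evaluations."""
    log_b = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.exp(log_b)
```

For example, beta(2.0, 3.0) is 1/12 up to rounding, since gamma(2) * gamma(3) / gamma(5) = 1 * 2 / 24.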


ibeta


The function ibeta calculates the incomplete beta function
for modules prob and betap. The incomplete beta function, along
with the R coefficients obtained by procedure modR, is the main
function of the equation at the center of this thesis project.
That equation (11) can be found in the first chapter. The main
equation used by module ibeta is equation (23), where

p, q, and x are input variables
beta() = a call to module beta
bint = error constant

The three input parameters p, q, and x are all real
variables. There are no output variables. The only value
the function ibeta returns to the calling modules prob and betap
is the result of the function.

The internal variable bint is necessary because the
calculation of the incomplete beta function is not a fixed ratio
between the input variables.


prob


The function prob calculates the theoretical probability
for a given point x for system two. The main module thesis and
module newton both need to call prob. The equation which
represents module prob is equation (24), where

ibeta() and loggam() are calls to modules
R[i] = R coefficients
a, v, p, l, x, n, and limit are input variables

The input variables a, v, p, l, x, and n are real and the
variable limit is an integer. There are no output variables;
only the result of the function is returned.

The internal integer variables i and k are used as loop
counters for the summations required by equation (24). The
other internal variables, reals tsum and t2sum, are used as
temporary holding locations for the intermediate sums.

Function prob's logic is depicted by equation (24).


betap 


The function betap calculates the inverse beta function.
Function newton uses the result from function betap as an
initial starting point for its calculations. For further
information about function newton, see the next section.

The parameters alpha, p, and q are input into the module
betap. With these starting parameters the module betap
calculates the inverse beta function and returns the value to
the calling module newton. No parameters are returned from
betap; only the value of the function is output.

The  only  module  betap  needs  to  call  is  the  module  ibeta. 

There  are  many  internal  variables  but  most  are  used  as  loop 
counters  or  constants.  The  integers  i  and  j  are  used  as  loop 
counters.  The  real  variables  are  used  as  constants. 


newton 

The function newton is called by the main module thesis to
calculate the theoretical value system two was designed for.
Newton's iterative method combined with the secant algorithm
dictated the internal logic of the module newton.

The variables a, v, p, l, x, n, and limit are all input
into the module newton. All the variables are used as arguments
to the probability function prob. There are no parameters
output from module newton, but the result of the function is
returned to the calling module thesis. Besides the module prob,
module newton also calls module betap to get the starting value
of its process.


The internal logic can be described by the following steps,
given the equation f(x) = 0 and the initial approximations P_0
and P_1.

Step 1 Set i = 2.

Step 2 Set

\[ P_i = P_{i-1} - \frac{f(P_{i-1})\,(P_{i-1} - P_{i-2})}{f(P_{i-1}) - f(P_{i-2})} \qquad (25) \]

Step 3 Determine if the procedure should be continued.
If so, go to Step 4. If not, go to Step 5.

Step 4 Add 1 to i and go to Step 2.

Step 5 The procedure is complete.


A simple linear approach would not have worked because of
the behavior of the probability function. The above steps were
implemented in a straightforward manner.
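The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the newton listing: the function name secant_root, the tolerance and the iteration cap are hypothetical choices, and f stands for any function whose root is wanted. In the thesis setting f(x) would be prob(x) minus alpha.

```python
def secant_root(f, p0, p1, tol=1e-7, max_iter=25):
    """Secant iteration of equation (25), starting from p0 and p1."""
    f0, f1 = f(p0), f(p1)
    for _ in range(max_iter):
        # Step 3: stop when close enough, or when the slope estimate vanishes
        if abs(f1) < tol or f1 == f0:
            break
        # Step 2: new estimate from the two previous ones
        p2 = p1 - f1 * (p1 - p0) / (f1 - f0)
        # Step 4: shift the window forward by one iterate
        p0, f0 = p1, f1
        p1, f1 = p2, f(p2)
    return p1
```

For example, secant_root(lambda x: x * x - 2.0, 1.0, 2.0) converges to the square root of two in a handful of iterations.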


thesis 


The main module thesis contains the driver, which
supervises the entire system, and subsystem three, which makes
the final analysis. The module thesis is invoked when the user
starts the system. The five modules getreal, lmodule, modR,
prob, and newton are called by module thesis and are described
in the following paragraphs as the internal logic of procedure
thesis is discussed.

The first step for procedure thesis is to ask the user for
the number of samples, the sample size, and the name of the file
where the data points are stored if batch processing is used.
The module getreal serves as the interface between the user and
module thesis.

The next step is to call procedure lmodule to calculate the
l-value. This represents the result of system one.

Before module modR is called to calculate the R
coefficients, a rough estimate is made of the number of terms
that the module prob will require. The estimate is made because
the time to calculate the R coefficients increases rapidly with
the number of terms needed.

After procedure modR is called, successive calls to the
module prob are made. Each time the call is made the number of
terms is increased. With x set at one, the results from prob
will quickly approach one. Once the function prob is within a
predetermined tolerance of one, the rest of the terms are
insignificant. This step limits the number of terms from
infinity in equation (11) to an amount that can be reasonably
calculated without much loss in accuracy.
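The truncation step can be illustrated with a small sketch. The names terms_needed and toy_partial_sum, the tolerance, and the cap of 30 terms are assumptions made for the example; toy_partial_sum is a hypothetical stand-in for calling prob with x set at one and a given limit, not the series of equation (11).

```python
def terms_needed(partial_sum_at_one, tol=1e-4, max_terms=30):
    """Smallest term count whose partial sum is within tol of one."""
    for limit in range(1, max_terms + 1):
        if abs(1.0 - partial_sum_at_one(limit)) < tol:
            return limit
    return max_terms  # give up at the cap if the tolerance is never met

def toy_partial_sum(limit):
    # Hypothetical stand-in: terms 0..limit of a geometric series that sums to one
    return sum(0.5 ** (i + 1) for i in range(limit + 1))
```

With the toy series, terms_needed(toy_partial_sum) stops at 13 terms, since the remaining tail is then below the tolerance.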

Next the user must enter the alpha level to be tested.
The module getreal is again used as an interface between the
user and the program. The rest of the steps are contained in a
loop controlled by the value of alpha. When the user enters an
alpha level of zero the loop and the program end; otherwise
the user can test as many alpha levels as desired.

The last module called is module newton. Function newton
returns the theoretical result that system two was designed
for. Now, with the results from systems one and two, system
three is executed.

In the final step, system three compares the results from
systems one and two and reports its conclusion to the user. If
the result from system two is larger than that from system one,
the hypothesis that all the data come from the same sample can
be rejected; otherwise a definite conclusion cannot be made.
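The alpha loop and the final comparison can be sketched as follows. This is a simplified illustration: run_alpha_loop and critical_value_for are hypothetical names, and critical_value_for stands in for the call to newton that produces the system two result for a given alpha.

```python
def run_alpha_loop(l_value, alpha_levels, critical_value_for):
    """Compare the system one result with the system two result for each alpha."""
    results = []
    for alpha in alpha_levels:
        if alpha == 0:
            break  # an alpha level of zero ends the loop
        critical = critical_value_for(alpha)  # stand-in for function newton
        if critical > l_value:
            verdict = "reject"
        else:
            verdict = "no definite conclusion"
        results.append((alpha, verdict))
    return results
```

Because the coefficients do not depend on alpha, only the call standing in for newton is repeated inside the loop.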


IV.  Testing  and  Verification 


This  chapter  will  discuss  the  methods  used  to  test  and 
verify  the  procedures  written  for  the  modules  described  in 
chapter three. First, the terms boundary value testing, path
testing,  and  drivers  will  be  described.  Next,  individual 
testing  of  some  modules  will  be  discussed.  Finally,  the 
original  requirements  from  chapter  one  will  be  re-examined  to 
come  up  with  the  test  data  required  to  show  whether  the  system 
developed  does  what  it  was  intended  to  do. 


Testing Methods

Boundary value testing, as used in this thesis, is setting
each variable to its extreme values and any special cases and
making sure the program still works as expected. For example, a
positive variable that has a highest value of thirty should be
tested for zero, one, twenty-nine, thirty, and thirty-one.
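The example above can be generated mechanically. The helper below is a hypothetical sketch: it assumes the variable is an integer with a known lowest and highest legal value, and it returns the values just outside, on, and just inside each boundary.

```python
def boundary_values(low, high):
    """Test values around the legal range low..high of an integer variable."""
    return [low - 1, low, high - 1, high, high + 1]
```

For a positive variable with a highest value of thirty, boundary_values(1, 30) gives 0, 1, 29, 30 and 31.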

Path testing, as used in this thesis, is setting the input
variables to different values to ensure each line of computer
code is executed at least once. This is important not only to
verify that each statement works correctly, but also to make
sure no statement is left unexecuted because of bad logic. An
example of path testing is, when the program has an if test,
making sure the then and else parts are both executed.


Drivers are modules that are essentially empty; their
purpose is to call lower modules in the structure charts for
testing. Drivers may also contain assignment statements for
the variables being passed to the lower modules and print
statements for printing the results of the subroutine calls.
The results of the subroutine calls are then checked against the
expected results to verify that the subroutine is working as
expected. The use of drivers allows the lower modules to be
tested completely before the higher modules are written. This
is an important concept because, for example, if a program has
five levels of modules and a result returned from the highest
level is wrong, the error would probably be in the highest level
if drivers were used in previous testing; otherwise the problem
could be in any of the modules in lower levels.
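A driver in this sense can be very small. The sketch below is hypothetical: sample_mean plays the role of a lower module under test, and driver holds the assignment statements, the print statements and the comparison against expected results.

```python
def sample_mean(points):
    """Lower-level module under test (hypothetical example)."""
    return sum(points) / len(points)

def driver():
    """Call the lower module with fixed inputs and report any mismatches."""
    cases = [([2.0, 4.0], 3.0), ([5.0], 5.0), ([0.0, 0.0, 3.0], 1.0)]
    failures = []
    for points, expected in cases:
        result = sample_mean(points)
        print(points, "->", result, "expected", expected)
        if abs(result - expected) > 1e-9:
            failures.append(points)
    return failures
```

An empty list returned from driver() means every case matched its expected result, so the lower module can be trusted when the next level is written.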


While each module is tested with all the inputs necessary
to fully test it, a test plan is also written. The test plan
is a check list of all the inputs and appears in tabular form.
Each item can then be checked systematically. A copy of the
test plan used for this project appears in Appendix B. When
developing a test plan it is important to remember that some
numbers can test many cases, and by picking the test numbers
carefully the number of tests that need to be run can be greatly
reduced. Next, the testing method of the individual modules
will be looked at.

getreal 

The main purpose of getreal is to read numbers into the
program. The numbers can come from a data file or be entered
interactively. All the numbers selected for the test cases shown
in Appendix B were tested both from a file and interactively.
The program only recognizes the first six characters of the file
name; therefore a short file name and one that is more than six
characters long should be tested. The module getreal should
accept integers and reals. The reals may or may not have digits
before the decimal point. The module should give a bad data
message for any characters besides digits and decimal points. A
number having two decimal points is also not accepted.
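The acceptance rules just listed can be summarized in a short sketch. This is an illustration of the rules only, not a translation of getreal: parse_number is a hypothetical name, it works on a single token rather than reading from a file, and it returns the value together with a bad data flag.

```python
def parse_number(token):
    """Return (value, bad_flag) for one token of digits with at most one '.'."""
    value = 0.0
    scale = 10.0
    seen_point = False
    for ch in token:
        if "0" <= ch <= "9":
            digit = ord(ch) - ord("0")
            if seen_point:
                value += digit / scale   # digits after the decimal point
                scale *= 10.0
            else:
                value = 10.0 * value + digit
        elif ch == "." and not seen_point:
            seen_point = True
        else:
            return 0.0, True             # bad character or second decimal point
    return value, False
```

Tokens such as "12", ".5" and "3.25" are accepted, while "4a" and "1.2.3" set the bad data flag.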

lmodule

The number of data points needed by module lmodule can be
determined by multiplying the number of samples times the sample
size. If there are not enough data points given when running
batch, the program will abend, and if there are too many the
program will ignore the additional data points. The data points
input into the module lmodule were both real and integer. A
zero was also included in the data points. The case where the
sample size was larger than the number of samples and also the
reverse case were tested.
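As a reading aid, the calculation performed in the lmodule listing can be sketched as below. This is an interpretation of that listing and should be treated as an assumption: the l-value is taken to be the product of the sample means divided by the average sample mean raised to the number of samples. The name l_value and the list-of-lists input are hypothetical.

```python
import math

def l_value(samples):
    """l-value from a list of equally sized samples (interpretation of lmodule)."""
    means = [sum(sample) / len(sample) for sample in samples]
    grand_mean = sum(means) / len(means)
    return math.prod(means) / grand_mean ** len(means)
```

When every sample has the same mean the result is one; for example, l_value([[1, 3], [2, 6]]) has means 2 and 4 and gives 8/9. The number of data points consumed is the number of samples times the sample size, as stated above.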


com 


The module com calculates the number of possible combinations
given two numbers. The two inputs n and i were set to various
numbers to test all cases. There are three cases to test for.
The first is when i equals zero. The next case is when n and i
are equal. The last case covers the remaining combinations of
numbers. When i equals zero or n the result is one.
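The three cases can be seen in a small sketch. The listing for com is not reproduced in this appendix excerpt, so the body below is an assumption about one reasonable way to compute the value; only the name com and the inputs n and i come from the text.

```python
def com(n, i):
    """Number of combinations of n items taken i at a time."""
    if i == 0 or i == n:
        return 1                      # the two boundary cases
    result = 1
    for k in range(1, i + 1):
        # after step k, result equals C(n - i + k, k), so the division is exact
        result = result * (n - i + k) // k
    return result
```

Testing com(5, 0), com(5, 5) and com(5, 2) covers the three cases named above.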

b 


Module b calculates Bernoulli numbers. The numbers are
constant and only depend on the previous numbers calculated,
therefore only a few were picked out to test. Two of the
numbers picked were zero and one to test the lower boundary.
Numbers three, five, and nine were also picked because they
equate to zero. The last two cases needed were a positive and
negative result.
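The dependence on previously calculated numbers can be sketched with the standard recurrence B_j = -1/(j+1) times the sum of C(j+1, i) B_i for i below j. This is an illustration, not the listing for module b: the name bernoulli is hypothetical, and exact fractions are used here so that the zero cases come out exactly rather than within a tolerance.

```python
from fractions import Fraction
from math import comb

def bernoulli(n):
    """Bernoulli numbers B_0..B_n from the standard recurrence."""
    b = [Fraction(1)]
    for j in range(1, n + 1):
        total = sum(comb(j + 1, i) * b[i] for i in range(j))
        b.append(Fraction(-1, j + 1) * total)
    return b
```

The test cases in the text correspond to B_0 = 1 and B_1 = -1/2 at the lower boundary, B_3 = B_5 = B_9 = 0, a positive value such as B_2 = 1/6, and a negative value such as B_4 = -1/30.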
System Two

The rest of the testing of the modules was similar to the
ones explained above. The module modR and the modules modR
needs for its calculations are just implementations of
equations. The test was accomplished by hand calculating some
numbers and then checking the results. Appendix B shows some of
the numbers tested. The results returned by the module newton
were compared against the results obtained by Dr. Nagarsenker in
1930.


thesis

The module thesis, which represents the entire system, was
tested for two main cases. The first is whether it works for
known data. The other is whether it meets all the requirements
stated in the first chapter and expanded throughout the thesis
effort. The requirements which were tested for are explained in
the rest of this chapter.

Requirements

The first requirement is that the system be interactive.
The system should be able to handle the user entering the data
points interactively and from a previously created data file.
Another feature, added for the user's convenience, is that the
user can test the data for multiple alpha levels without waiting
for the recalculation of coefficients that do not change. The
last feature is that when a user enters a bad character while
entering interactive data, the user should be able to re-enter
just that one point and not have to restart from the beginning.

The test cases illustrated in Appendix B for the module
thesis were entered interactively and from a file to test all
the requirements. Most of the system is a black box to the
user, which simplifies the testing of the requirements.


V.  Conclusion 


This  chapter  will  first  re-examine  the  overall  requirements 
of  this  thesis  and  discuss  whether  or  not  they  were  met.  Then 
possible  problems,  along  with  suggestions  on  how  to  resolve 
them,  will  be  discussed.  Finally,  as  a  way  of  concluding, 
future  expansion  of  this  thesis  will  be  looked  at. 

Requirements 

The overall thesis was broken into three subsystems. The
overall project was to create an interactive statistical package
to examine failure rates assuming an exponential distribution.
System one calculated the l-value. System two calculated the
theoretical value. System three compared the results from
systems one and two and reported its conclusion back to the
user.

Each of the subsystems worked correctly for most of the
test cases. Some of the minor problems will be discussed later.
Test data from some well known results showed that the
requirements were more than met.

A faculty member, who helped to test the user friendliness
of the system, found it easy to use after taking a few minutes
to read the user manual in Appendix C. The response time was
found to be reasonable for interactive work. The ability to
change the alpha level, without having to perform most of the
time consuming calculations over, greatly added to the
usefulness of the entire system.

Problems  and  Suggestions 

There were two basic problems found, but neither was
prohibitive. The first was that if the user decided to store
the data points in a data file and misspelled the file name
when responding to the program's prompt, the program would abend
with a file not found message. This was not found to be much of
a problem because the file name was one of the first questions
asked of the user, so not much processing time was lost. The
solution is to re-execute the program.

The second problem is that if the sample size is not more
than two greater than the number of samples, the program will
not work correctly. This is because Pascal only has single
precision, and when both numbers are close the round off error
is too great. This is not a problem if the user has sufficient
data for each sample. The two possible solutions are to take a
couple of samples at a time or, the better solution, to recode the
project  in  a  language  which  allows  double  precision  such  as  3. 
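The effect of single precision can be demonstrated with a small sketch. This is a generic illustration of round off, not a reproduction of the failing case in the program: to_single is a hypothetical helper that rounds a value to IEEE single precision by packing it into four bytes and reading it back.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]

def distinguishable(a, b, single=True):
    """True if a and b remain different after rounding to the chosen precision."""
    if single:
        return to_single(a) != to_single(b)
    return a != b
```

Two numbers that differ by one part in a hundred million, such as 1.0 and 1.0 + 1e-8, are still different in double precision but collapse to the same value in single precision, so any calculation that depends on their difference loses all of its information.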


Future  Expansion 


This thesis only handles a small, but useful, set of
possible distributions. The same procedure can be used to
automate many of the others. This thesis can act as a starting
point for a large statistical package while also being useful
until the rest are completed.


(****************************************************************
*
*  NAME: getreal
*  FUNCTION: reads in real numbers from a specified file
*  INPUTS: filename
*  OUTPUTS: num, flag
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: none
*  CALLING MODULES: thesis, lmodule
*
****************************************************************)

procedure getreal(var filename : text;
                  var num : real; var flag : boolean);
var ch : char; i : real; decimal : boolean;
begin
  num := 0.0; flag := false; decimal := false;
  if eoln(filename)
    then readln(filename);
  read(filename, ch);
  (* skip leading blanks *)
  while (not eoln(filename)) and (ch = ' ') do
    read(filename, ch);
  if (ch >= '0') and (ch <= '9')
    then num := ord(ch) - ord('0')
    else if ch = '.'
      then decimal := true
      else flag := true;
  (* digits before the decimal point *)
  while (not eoln(filename)) and (not flag) and (ch <> ' ')
        and (not decimal) do
    begin
      read(filename, ch);
      if (ch >= '0') and (ch <= '9')
        then num := 10 * num + (ord(ch) - ord('0'))
        else if ch = '.'
          then decimal := true
          else if ch <> ' '
            then flag := true
    end;
  i := 10;
  (* digits after the decimal point *)
  while (not eoln(filename)) and (not flag) and (ch <> ' ')
        and (decimal) do
    begin
      read(filename, ch);
      if (ch >= '0') and (ch <= '9')
        then num := num + (ord(ch) - ord('0')) / i
        else if ch <> ' '
          then flag := true;
      i := i * 10
    end;
  (* on bad data, discard the rest of the line *)
  if flag
    then while not eoln(filename) do read(filename, ch)
end;
(****************************************************************
*
*  NAME: lmodule
*  FUNCTION: calculates l value
*  INPUTS: filename, s, n
*  OUTPUTS: lmodule
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: getreal
*  CALLING MODULES: thesis
*
****************************************************************)

function lmodule(filename : string; s, n : real) : real;
var sum, add, mult, i, j, k : real;
    flag : boolean; num, count : integer;
begin
  flag := false;
  sum := 0.0; add := 0.0; mult := 1.0;
  i := 0.0; j := 0.0; k := 0.0;
  while (j < n) and (not flag) do
    begin
      while (i < s) and (not flag) do
        begin
          if filename = 'input'
            then begin
              num := trunc(i);
              write(num+1, '. ');
              getreal(input, k, flag);
              writeln
            end
            else
              getreal(infile, k, flag);
          sum := sum + k;
          if flag
            then begin
              writeln('bad data');
              if filename = 'input'
                then flag := false
            end
            else i := i + 1.0
        end;
      sum := sum / s; add := add + sum;
      mult := mult * sum; sum := 0.0;
      i := 0.0; j := j + 1
    end;
  add := add / n; j := add;
  num := trunc(n) - 1;
  for count := 1 to num do add := add * j;
  lmodule := mult / add
end;

(****************************************************************
*
*  NAME: b
*  FUNCTION: calculates bernoulli numbers
*  INPUTS: n
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: bflag
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: barray
*  MODULES CALLED: com
*  CALLING MODULES: bernpoly
*
****************************************************************)

procedure b(n : integer);
var sum : real;
    i, j : integer;
begin
  bflag := true;
  barray[0] := 1.0;
  if (n < 1) or (n > 30)
    then n := 0;
  for j := 1 to n do
    begin
      sum := 0.0;
      for i := 0 to (j-1) do
        sum := sum + com(j+1,i) * barray[i];
      barray[j] := ((-1)/(j+1)) * sum
    end;
  (* values within round off of zero are set to exactly zero *)
  for j := 1 to n do
    if (barray[j] < 0.0000001) and (barray[j] > -0.0000001)
      then barray[j] := 0.0
end;
(****************************************************************
*
*  NAME: bernpoly
*  FUNCTION: evaluates bernoulli polynomials
*  INPUTS: n, x
*  OUTPUTS: bernpoly
*  GLOBAL VARIABLES USED: bflag, num
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: barray
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: b, com
*  CALLING MODULES: ArCoef, createAjr
*
****************************************************************)

function bernpoly(n : integer; x : real) : real;
var k : integer;
    sum : real;
    power : real;
begin
  sum := 0.0;
  power := 1.0;
  if (not bflag)
    then b(num);
  for k := n downto 0 do
    begin
      sum := sum + com(n,k) * barray[k] * power;
      power := power * x
    end;
  bernpoly := sum
end;


(****************************************************************
*
*  NAME: ArCoef
*  FUNCTION: calculates Ar coefficients for module Queue
*  INPUTS: r, p, l
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: Ar
*  MODULES CALLED: bernpoly
*  CALLING MODULES: Queue
*
****************************************************************)

procedure ArCoef(r : integer; p, l : real);
var sign : integer;
    partial : real;
    ptor : real;
    k : integer;
    pl : real;
begin
  ptor := 1.0;
  pl := p * l;
  Ar[0] := 1.0;
  sign := 1;
  for k := 1 to r do
    begin
      ptor := ptor * p;
      sign := sign * (-1);
      partial := (bernpoly(k+1,pl) -
                  p*ptor*bernpoly(k+1,l))/ptor;
      Ar[k] := (sign/(k*(k+1)))*partial
    end
end;


(****************************************************************
*
*  NAME: Queue
*  FUNCTION: creates Q coefficients for module modR
*  INPUTS: r, p, l
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: Ar
*  GLOBAL TABLES CHANGED: Q
*  MODULES CALLED: ArCoef
*  CALLING MODULES: modR
*
****************************************************************)

procedure Queue(r : integer; p, l : real);
var i, k : integer;
    sum : real;
begin
  ArCoef(r, p, l);
  Q[0] := 1.0;
  for i := 1 to r do
    begin
      sum := 0.0;
      for k := 1 to i do
        sum := sum + k * Ar[k] * Q[i-k];
      Q[i] := sum / i
    end
end;


(****************************************************************
*
*  NAME: createAjr
*  FUNCTION: creates Ajr coefficients for createCjr
*  INPUTS: j, r, a, v
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: Ajr
*  MODULES CALLED: bernpoly
*  CALLING MODULES: createCjr
*
****************************************************************)

procedure createAjr(j, r : integer; a, v : real);
var l, m : integer;
    sign, rt : integer;
    temp : real;
begin
  sign := 1;
  for m := 0 to j do
    Ajr[m,0] := 1.0;
  for l := 1 to r do
    begin
      rt := l + 1;
      temp := bernpoly(rt, a);
      for m := 0 to j do
        Ajr[m,l] := (sign/((l)*(l+1)))*
                    (temp - bernpoly(rt, a+v+m));
      sign := sign * (-1)
    end
end;


(****************************************************************
*
*  NAME: createCjr
*  FUNCTION: creates Cjr coefficients for modR
*  INPUTS: j, r, a, v
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: Ajr
*  GLOBAL TABLES CHANGED: Cjr
*  MODULES CALLED: createAjr
*  CALLING MODULES: modR
*
****************************************************************)

procedure createCjr(j, r : integer; a, v : real);
var l, m : integer;
    k : integer;
    sum : real;
begin
  createAjr(j, r, a, v);
  for m := 0 to j do
    Cjr[m,0] := 1.0;
  for l := 1 to r do
    for m := 0 to j do
      begin
        sum := 0.0;
        for k := 1 to l do
          sum := sum + k * Ajr[m,k] * Cjr[m,l-k];
        Cjr[m,l] := sum / l
      end
end;


(****************************************************************
*
*  NAME: modR
*  FUNCTION: creates R coefficients for function prob
*  INPUTS: j, r, a, v, p, l
*  OUTPUTS: none
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: Q, Cjr
*  GLOBAL TABLES CHANGED: R
*  MODULES CALLED: Queue, createCjr
*  CALLING MODULES: thesis
*
****************************************************************)

procedure modR(j, r : integer; a, v, p, l : real);
var sum : real;
    i, k : integer;
begin
  createCjr(j, r, a, v);
  Queue(r, p, l);
  R[0] := 1.0;
  for i := 1 to r do
    begin
      sum := 0.0;
      for k := 1 to i do
        sum := sum + R[i-k] * Cjr[i-k,k];
      R[i] := (Q[i] - sum) / Cjr[i,0]
    end
end;


(****************************************************************
*
*  NAME: loggam
*  FUNCTION: calculates the logarithm of gamma of n
*  INPUTS: dx
*  OUTPUTS: loggam
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: none
*  CALLING MODULES: beta, prob
*
****************************************************************)

function loggam(dx : real) : real;
var rdo, dy, dterm, de, da, db, domeg, dlggm : real;
    ds, dz, dw, dv, du, dt, dr, dq, dp : real;
begin
  rdo := 0.0; dy := dx;
  dterm := 1.0; de := 1.0;
  domeg := 1.0e25;
  da := 0.999999999; db := 1.000000001;
  dlggm := domeg;
  if (dx >= rdo)
    then begin
      dlggm := rdo;
      if ((dx <= da) or (dx >= db))
        then begin
          if ((dx <= (da+de)) or (dx >= (db+de)))
            then begin
              (* shift the argument above 18 before using the series *)
              while ((dy-18.00) <= 0.0) do
                begin
                  dterm := dterm * dy;
                  dy := dy + de
                end;
              ds := de / (dy * dy);
              dz := 0.006410256410256410;
              dw := -0.001917526917526918;
              dv := 0.0008417508417508418;
              du := -0.0005952380952380952;
              dt := 0.0007936507936507937;
              dr := -0.002777777777777778;
              dq := 0.08333333333333333;
              dp := 0.9189385332046727;
              dlggm := (dy-0.5)*ln(dy)+dp-dy-ln(dterm);
              dlggm := dlggm + ((((((dz*ds+dw)*ds+dv)*ds+du)
                       *ds+dt)*ds+dr)*ds+dq)/dy
            end
        end
    end;
  loggam := dlggm
end;


(****************************************************************
*
*  NAME: beta
*  FUNCTION: calculates the beta of p and q
*  INPUTS: p, q
*  OUTPUTS: beta
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: loggam
*  CALLING MODULES: ibeta
*
****************************************************************)

function beta(p, q : real) : real;
begin
  beta := exp(loggam(p) + loggam(q) - loggam(p+q))
end;


(****************************************************************
*
*  NAME: ibeta
*  FUNCTION: calculates the incomplete beta function
*  INPUTS: p, q, x
*  OUTPUTS: ibeta
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: beta
*  CALLING MODULES: prob, betap
*
****************************************************************)

function ibeta(p, q, x : real) : real;
var bint, pp, qq, dp, dq, xx, rxx, t, d, b, s, temp1, temp2 : real;
    ick, iq, ip, iqm, i : integer;
begin
  bint := 0.0;
  if ((x = 0.0) or (x = 1.0)) then bint := x
  else begin
    pp := p;
    qq := q;
    ick := 0;
    xx := x;
    iq := trunc(q);
    dq := iq;
    ip := trunc(p);
    dp := ip;
    if ((dq = q) or (dp = p))
      then begin
        if (dp = p)
          then begin
            iq := ip;
            pp := q;
            qq := p;
            ick := 1;
            xx := 1.0 - x
          end;
        t := 1.0;
        rxx := xx/(1.0-xx);
        bint := t;
        iqm := iq - 1;
        if (iqm <> 0)
          then for i := 1 to iqm do
            begin
              d := i;
              t := t*rxx*(qq-d)/(pp+d);
              bint := bint + t
            end
      end
      else begin
        if (xx > (pp/(pp+qq)))
          then begin
            pp := q;
            qq := p;
            xx := 1.0 - x;
            ick := 1
          end;
        bint := 1.0;
        t := 1.0;
        iq := trunc(qq+(1.0-xx)*(pp+qq));
        s := iq;
        if (iq = 0)
          then begin
            t := 1.0 - xx;
            bint := t
          end
          else begin
            if (iq <> 1)
              then begin
                rxx := xx/(1.0-xx);
                iqm := iq - 1;
                for i := 1 to iqm do
                  begin
                    d := i;
                    t := t * rxx * (qq-d)/(pp+d);
                    bint := bint + t
                  end
              end;
            t := t * xx * (qq - s)/(pp+s);
            bint := bint + t
          end;
        i := 1;
        while ((i < 101) and
               (abs(t/bint) > 0.0000000000000000000000001)) do
          begin
            d := i;
            t := t * (pp+qq+d-1.0)/(pp+s+d)*xx;
            bint := bint + t;
            i := i + 1
          end
      end;
    b := beta(p,q);
    temp1 := exp(pp*ln(xx));
    temp2 := exp((qq-1.0)*ln(1.0-xx));
    bint := temp1*temp2/(b*pp)*bint;
    if (ick = 1)
      then bint := 1.0 - bint
  end;
  ibeta := bint
end;


(****************************************************************
*
*  NAME: prob
*  FUNCTION: calculates the distribution function of l
*  INPUTS: a, v, p, l, x, n, limit
*  OUTPUTS: prob
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: R
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: loggam, ibeta
*  CALLING MODULES: thesis, newton
*
****************************************************************)

function prob(a, v, p, l, x, n : real; limit : integer) : real;
var i, j, k : integer;
    tsum, t2sum : real;
begin
  t2sum := 0.0;
  tsum := 1.0;
  j := trunc(p-1);
  for k := 1 to j do
    tsum := tsum * exp(loggam(n+k/p) - loggam(n));
  for i := 0 to limit do
    t2sum := t2sum + R[i]*exp(loggam(n-1+a) - loggam(n-1+a+v+i))
             * ibeta(n-l+a, v+i, x);
  prob := tsum * t2sum
end;


(****************************************************************
*
*  NAME: betap
*  FUNCTION: calculates the inverse beta function
*  INPUTS: alpha, p, q
*  OUTPUTS: betap
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: ibeta
*  CALLING MODULES: newton
*
****************************************************************)

function betap(alpha, p, q : real) : real;
var jj, jend, j, i : integer;
    dp, dl, dif, dlx, dux, dmp, decr, dmpu : real;
    dm, dn, du, dfd, dabf, dfune : real;
    flag : boolean;
    esp1, esp2, esp3, esp4 : real;
    darg, dfun : array[1..4] of real;
begin
  esp1 := 1.0e-180; esp2 := 1.0e-13;
  esp3 := 1.0e-11; esp4 := 1.0e-10;
  dp := alpha; dm := p;
  dn := q; du := 1.0;
  flag := true;
  if (((dp*(du-dp)) < 0) or (dm < 0) or (dn < 0))
    then dmp := 0.0
    else if ((dp*(du-dp)) = 0)
      then dmp := alpha
      else if (dm = 1.0)
        then dmp := du - exp((du/dn)*ln(du-dp))
        else if (dn = 1.0)
          then dmp := exp((du/dm)*ln(dp))
          else flag := false;
  if (not flag)
    then begin
      dl := 0.0; dif := 1.0/3.0;
      dlx := -dp; dux := du-dp;
      jj := 0;
      dmpu := 0.0;
      jend := 3;
      while ((jj < 25) and (not flag)) do
        begin
          if jj = 25 then jend := 3;
          jj := jj + 1;
          j := 1;
          while ((j <= jend) and (not flag)) do
            begin
              dmp := (du+dl)/2.0;
              i := 1;
              if ((du-dl) < esp1)
                then flag := true
                else if (((du-dl) < (esp2*dp)) and (dl > esp2))
                  then flag := true
                  else while ((i < 3) and (not flag)) do
                    begin
                      darg[i] := dl+(du-dl)*dif*i;
                      dfun[i] := ibeta(dm, dn, darg[i]) - dp;
                      if (dfun[i] = 0) then dmp := darg[i];
                      if (dfun[i] = 0) then flag := true
                      else if ((dfun[i] < 0.0) and (i = 2))
                        then begin
                          dl := darg[2]; dlx := dfun[2]
                        end
                        else if (dfun[i] > 0.0)
                          then begin
                            du := darg[i]; dux := dfun[i];
                            if (i = 2)
                              then begin
                                dl := darg[1]; dlx := dfun[1]
                              end
                              else i := 2
                          end;
                      i := i + 1
                    end;
              j := j + 1
            end;
          if (not flag)
            then begin
              jend := 2; dmp := (du+dl)/2.0;
              dfd := dux-dlx;
              if ((dfd < esp3) and (dfd < (esp4*dp)))
                then flag := true
            end;
          if (not flag)
            then begin
              decr := dux * (du-dl)/dfd; dmp := du - decr;
              if (((dmp-dl) < esp1) or (((dmp-dl) < esp2)
                  and (dl > esp2))) then flag := true
            end;
          if (not flag)
            then begin
              dfun[3] := ibeta(dm,dn,dmp)-dp;
              dabf := abs(dfun[3]); dfune := dfun[3];
              if (((dabf < esp3) and (dabf < (esp4*dp))) or
                  (dmp < esp1) or (((du-dmpu) < esp2) and
                  (du > 0.999999999999)) or (dfun[3] = 0.0))
                then flag := true
            end;
          if (not flag)
            then begin
              if (dfun[3] < 0.0)
                then begin
                  if (decr < (0.9*(du-dl)))
                    then begin
                      dl := dmp; dlx := dfune
                    end
                    else begin

dmpu  i=  dmpj  dmp  :=  5. 0* '  •- ;  -dl 
                      dfune := ibeta(dm,dn,dmp)-dp;
                      if (dfune = 0.0)
                        then flag := true;
                      if (not flag)
                        then begin
                          if (dfune < 0.0)
                            then begin
                              dl := dmp; dlx := dfune
                            end
                            else begin
                              du := dmp; dux := dfune;
                              dl := dmpu; dlx := dfun[3]
                            end
                        end
                    end
                end
                else begin
                  if (decr >= (0.1*(du-dl)))
                    then begin
                      du := dmp; dux := dfune
                    end
                    else begin
                      dmpu := dmp;
                      dmp := du-5.0*decr;
                      dfune := ibeta(dm, dn, dmp)-dp;
                      if (dfune = 0.0) then flag := true;
                      if (not flag)
                        then begin
                          if (dfune < 0.0)
                            then begin
                              du := dmpu;
                              dux := dfun[3];
                              dl := dmp;
                              dlx := dfune
                            end
                            else begin
                              du := dmp;
                              dux := dfune
                            end
                        end
                    end
                end
            end
        end
    end;
  betap := dmp
end;


(****************************************************************
*
*  NAME: newton
*  FUNCTION: calculates the percentage point using
*            newton's approximation
*  INPUTS: a, v, p, l, alpha, n, limit
*  OUTPUTS: newton
*  GLOBAL VARIABLES USED: none
*  GLOBAL VARIABLES CHANGED: none
*  GLOBAL TABLES USED: none
*  GLOBAL TABLES CHANGED: none
*  MODULES CALLED: betap, prob
*  CALLING MODULES: thesis
*
****************************************************************)

function newton(a, v, p, l, alpha, n : real; limit : integer) : real;
var z, pa : array[1..25] of real;
    sp : real;
    k : integer;
    done : boolean;
begin
  (* starting value from the inverse beta function *)
  z[1] := betap(alpha, (n-(p+1)/(6*p)), v);
  if p < 3
    then z[2] := z[1] + 0.05
    else z[2] := z[1] - 0.05;
  x := z[1];
  sp := prob(a, v, p, l, x, n, limit);
  pa[1] := sp;
  k := 2;
  done := false;
  while ((k < 25) and (sp <= 1.0) and (sp >= 0.0) and (not done)) do
    begin
      x := z[k];
      sp := prob(a, v, p, l, x, n, limit);
      pa[k] := sp;
      if (abs(sp-alpha) < 0.0000001)
        then done := true
        else z[k+1] := z[k] - (z[k]-z[k-1]) *
                       (pa[k]-alpha)/(pa[k]-pa[k-1]);
      k := k + 1
    end;
  if ((sp > 1.0) or (sp < 0.0))
    then writeln('incorrect prob value', x, sp);
  newton := z[k-1]
end;


(*
   NAME: thesis
   FUNCTION: manages overall system
   INPUTS: none
   OUTPUTS: none
   GLOBAL VARIABLES USED: none
   GLOBAL VARIABLES CHANGED: bflag
   GLOBAL TABLES USED: none
   GLOBAL TABLES CHANGED: none
   MODULES CALLED: getreal, lmodule, modR, prob, newton
   CALLING MODULES: none
*)


program thesis (input,output);

type string  = packed array[1..5] of char;
     numary  = array[0..30] of real;
     numary2 = array[0..30,0..30] of real;
var  filename : string;
     bflag,flag : boolean;
     l,n,t,x,a,v,p,prb,tsum,alpha,s : real;
     limit,i,j,k,r,num : integer;
     barray,R,Ar,Q : numary;
     Ajr,Cjr : numary2;
     infile : text;

begin
  flag := false;  bflag := false;

  { number of samples; getreal sets flag when the entry is not a number }
  writeln('enter number of samples');
  getreal(input,p,flag);
  while (flag) do
  begin
    writeln('error enter number');
    getreal(input,p,flag)
  end;

  writeln('enter sample size');
  getreal(input,n,flag);
  while (flag) do
  begin
    writeln('error enter number');
    getreal(input,n,flag)
  end;

  { file name: at most five characters are read }
  writeln('enter the file name where the data is stored');
  writeln(' or "input" for entering the data interactively');
  i := 1;  filename[5] := ' ';  readln;
  while (i<6) and (not eoln) do
  begin
    read(filename[i]);
    i := i + 1
  end;
  if filename <> 'input'
    then reset(infile,filename);
  l := lmodule(filename,n,p);

  { number of series terms allowed }
  if (n >= 20)
    then num := 10
    else if ((n-p)>3.0)
      then num := 20
      else num := 22;
  j := num;
  r := num;
  t := (p+1)/(6*p);
  v := (p-1)/2;
  a := (1-v)/2;
  modR(j,r,a,v,p,t);

  { add terms until the probability at x = 1.0 converges to one }
  x := 1.0;
  prb := 0.0;
  limit := 0;
  while ((abs(prb-1.0)>0.0000001) and (limit<num)) do
  begin
    limit := limit + 1;
    prb := prob(a,v,p,t,x,n,limit)
  end;

  alpha := 1.0;
  if (limit >= num)
    then writeln('too few terms, change number of terms')
    else begin
      writeln('enter alpha level to be tested, 0 to end');
      getreal(input,alpha,flag);
      while flag do
      begin
        writeln('enter alpha level to be tested, 0 to end');
        getreal(input,alpha,flag)
      end
    end;

  while (alpha <> 0) do
  begin
    { compare the percentage point with the statistic from lmodule }
    prb := newton(a,v,p,t,alpha,n,limit);
    if (prb > l)
      then writeln('reject hypothesis that all data')
      else writeln('can''t reject hypothesis that all data');
    writeln(' comes from the same sample');
    writeln('enter alpha level to be tested, 0 to end');
    getreal(input,alpha,flag);
    while flag do
    begin
      writeln('enter alpha level to be tested, 0 to end');
      getreal(input,alpha,flag)
    end
  end
end.
Test Plan

Appendix C
User’s  Guide 


This appendix first shows a simple example of how to run the
thesis program. Each of the computer prompts is then explained.
Finally, the option for entering the data interactively is
described in more detail. In the following example the user
responses are marked with a "+", although this marker does not
appear on the screen when the thesis program is run.

enter number of samples

+ 3

enter sample size

+ 7

enter the file name where the data is stored
 or "input" for entering the data interactively

+ data

enter alpha level to be tested, 0 to end

+ 0.05

reject hypothesis that all data
 comes from the same sample

enter alpha level to be tested, 0 to end


The first two prompts ask for the number of samples and the
size of each sample to be tested. Each sample must contain the
same number of data points. The total number of data points the
program expects is the number of samples multiplied by the
sample size; in the example above, 3 samples of 7 points give
21 data points.

The next prompt tells the program where to find the test
data. If the user enters a name other than "input", the name is
taken to be the name of the file in which all the data points
are stored. As the program listing shows, only the first five
characters of the name are read. After a file name is entered
there are three possible outcomes. First, if the file does not
exist, the program will abend with a file-not-found message.
Second, if the file exists but contains too few data points, or
contains non-numeric symbols, the program will end with a bad
data message. Third, and this is the intended case, the program
finds the file and enough data points and continues with the
next prompt. The number of data points needed is given in the
previous paragraph. If the file holds more data points than
required, the excess is ignored. The response "input" is
examined in detail later in this appendix.
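
The file handling described above can be sketched in Pascal as
follows. This is only an illustration under stated assumptions,
not the thesis's lmodule routine: the names readsamples, datary,
maxpts and the file demo.dat are hypothetical, and assign and
seekeof are extensions found in Turbo-style compilers such as
Free Pascal rather than standard Pascal. The sketch covers the
cases of too few and of surplus data points; it does not trap
non-numeric symbols.

```pascal
program readsketch;
{ Hypothetical sketch: readsamples, datary, maxpts and 'demo.dat' are
  illustrative names and are not taken from the thesis listing. }
const maxpts = 900;
type  datary = array[1..maxpts] of real;
var   f      : text;
      data   : datary;
      needed : integer;

{ Reads exactly 'needed' reals from f into d.
  Returns false when the file runs out before that many are found. }
function readsamples(var f : text; needed : integer;
                     var d : datary) : boolean;
var count : integer;
begin
   count := 0;
   { seekeof skips blanks and line ends before testing for end of file }
   while (count < needed) and (not seekeof(f)) do
   begin
      count := count + 1;
      read(f, d[count])
   end;
   readsamples := (count = needed)
end;

begin
   { build a small demonstration file: six values plus one surplus value }
   assign(f, 'demo.dat');
   rewrite(f);
   writeln(f, '1.5 2.5 3.5');
   writeln(f, '4.5 5.5 6.5 99.0');
   close(f);

   needed := 3 * 2;     { number of samples times sample size }
   reset(f);
   if readsamples(f, needed, data)
      then writeln('read ', needed, ' points; last value used is ',
                   data[needed]:4:1)
      else writeln('bad data');
   close(f)
end.
```

Because reading stops once the required count is reached, the
seventh value in the demonstration file is never examined, which
mirrors the statement that excess data points are ignored.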

The program can also end with one other message, "too few
terms, change number of terms." This message tells the user
that a theoretical value cannot be calculated for the given
number of samples and sample size because of the limited
accuracy of single precision arithmetic in Pascal. The problem
can be overcome by taking fewer samples or by increasing the
sample size. One possible remedy is to convert the program to a
language that offers double precision, such as C.
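
The effect of the precision limit can be illustrated with a short
Pascal sketch. It assumes a compiler that offers both a single
and a double real type (Free Pascal is one example); the compiler
used for the thesis is not identified here, and its single
precision format may differ. The program name precsketch and the
variables s and d are hypothetical. The increment of 1.0e-8 is
chosen to be smaller than the 0.0000001 tolerance used by the
convergence tests in the listing.

```pascal
program precsketch;
{ Hypothetical sketch: assumes 'single' and 'double' real types are
  available, as in Free Pascal; not part of the thesis listing. }
var s : single;
    d : double;
begin
   s := 1.0;
   d := 1.0;
   s := s + 1.0e-8;     { too small to change a single precision 1.0 }
   d := d + 1.0e-8;     { well within the reach of double precision }
   writeln('single still equals 1.0: ', s = 1.0);
   writeln('double still equals 1.0: ', d = 1.0)
end.
```

With IEEE arithmetic the single precision sum is stored as
exactly 1.0 while the double precision sum is not, so a
convergence test with a tolerance near 0.0000001 leaves little
margin in single precision.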

The last prompt before the results are determined asks for
the alpha level. The program continues to prompt for further
alpha levels to be tested until a zero (0) is entered, which
ends the loop and the program.

The example above shows one of the two possible results of
the thesis program's calculation, which is "reject hypothesis
that all data comes from the same sample." The other possible
result is "can't reject hypothesis that all data comes from the
same sample."

The response "input" to the file name prompt allows the
user to enter the data interactively. The user is prompted with
a number followed by a period, which indicates the position of
the next data point within the current sample. The prompt number
runs from one to the sample size for each sample. If the user
enters anything other than a positive real number, the program
repeats the same prompt, indicating that the entry was ignored.
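
The check applied to each interactive entry can be sketched as
follows. This is an assumption about how such a test might be
written, not the thesis's getreal routine: the helper ispositive
is hypothetical, and it relies on the val procedure and the
built-in string type of Turbo-style compilers such as Free
Pascal. An interactive version would repeat the numbered prompt
for as long as the helper returns false.

```pascal
program validsketch;
{ Hypothetical sketch: ispositive is an illustrative helper and is not
  the getreal routine of the thesis. }
var x : real;

{ Returns true, and leaves the value in x, only when s holds a
  positive real number. }
function ispositive(s : string; var x : real) : boolean;
var code : integer;
begin
   val(s, x, code);     { code = 0 means the whole string was a number }
   ispositive := (code = 0) and (x > 0.0)
end;

begin
   writeln(ispositive('12.5', x));   { accepted }
   writeln(ispositive('abc', x));    { rejected: not a number }
   writeln(ispositive('-3', x))      { rejected: not positive }
end.
```

The three calls print TRUE, FALSE and FALSE in turn.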